Abstract

This paper considers transmission schemes in multi-access relay networks (MARNs) where $J$ single-antenna
sources send independent information to one $N$-antenna destination through one $M$-antenna relay. For
complexity considerations, we propose a linear framework, where the relay linearly transforms its received
signals to generate the forwarded signals without decoding and the destination uses its multiple antennas to
fully decouple signals from different sources before decoding, by which the decoding complexity is linear in
the number of sources. To achieve a high symbol rate, we first propose a scheme called
Concurrent$_{S\to R\to D}$-IC$_D$ in which all sources' information streams are concurrently transmitted in
both the source-relay link and the relay-destination link. In this scheme, distributed space-time coding (DSTC)
is applied at the relay, which satisfies the linear constraint. DSTC also allows the destination to conduct the
zero-forcing interference cancellation (IC) scheme originally proposed for multi-antenna systems to fully
decouple signals from different sources. Our analysis shows that the symbol rate of
Concurrent$_{S\to R\to D}$-IC$_D$ is 1/2 symbols/source/channel use and the diversity gain of the scheme is
upperbounded by $M - J + 1$. To achieve a higher diversity gain, we propose another scheme called
Concurrent$_{R\to D}$-IC$_D$ in which the sources time-share the source-relay link. The relay coherently
combines the signals on its antennas to maximize the signal-to-noise ratio (SNR) of each source, then
concurrently forwards all sources' information. The destination performs zero-forcing IC. It is shown through
both analysis and simulation that when $N \ge 2J - 1$, Concurrent$_{R\to D}$-IC$_D$ achieves the same maximum
diversity gain as the full TDMA scheme in which the information stream from each source is assigned to an
orthogonal channel in both links, but with a higher symbol rate.

Index Terms: Multi-access relay network, distributed space-time coding, interference cancellation, orthogonal 
and quasi-orthogonal designs, cooperative diversity. 
1 Introduction

Node cooperation improves the reliability and the capacity of wireless networks. Recently, many cooperative
schemes have been proposed, and their multiplexing and diversity gains have been analyzed [1]–[4]. However, most
pioneering works in this area focus on cooperative relay designs without multi-user interference. It is assumed
that there is a single transmission at a time or that orthogonal channels are assigned to different transmissions,
e.g., [1]–[3]. As a general network has multiple nodes, each of which can be a data source, allocating an orthogonal
channel to the information stream of each source is bandwidth inefficient. Therefore, concurrent transmission of
information streams from multiple sources is desirable in cooperative networks to improve spectrum efficiency.
Some examples of the design and performance analysis of multi-source transmission can be found in [5]–[8].

One model of multi-source transmission is the interference relay network [9]. Multiple parallel communication
flows are supported by a common set of cooperative relays through two hops of transmission. Each
source targets one distinct destination. Two schemes using relays to resolve interference have been discussed. The
zero-forcing (ZF) relaying scheme designs scalar gain factors at single-antenna relays to null out interference
at undesired destinations [10]–[12]. The minimum mean square error (MMSE) relaying scheme designs scalar
gain factors to minimize interference-plus-noise power at undesired destinations [13], [14]. However, both relaying
schemes assume that the gain factors are first calculated at one centralized node having perfect and global
channel state information (CSI), then fed back to the relays. While [10]–[14] discuss the multiplexing
gain and designs of the optimal scalar gain factors, they do not provide diversity analysis. An interference relay
network with multi-antenna nodes was discussed in [15], in which the authors used maximum-ratio-combining
(MRC), ZF, and MMSE relaying schemes and analyzed the power-bandwidth trade-off of the network.

Another model of multi-source cooperative communication considers the scenario where several sources
target one multi-antenna destination with the help of one multi-antenna relay. The network is called the
multi-access relay network (MARN) [16]. We use the notation $1_J \times M_1 \times N_1$ to represent the MARN with $J$
single-antenna sources, one $M$-antenna relay, and one $N$-antenna destination. For the MARN, the source-relay
link is a multi-access channel (MAC) and the relay-destination link is a point-to-point multiple-input
multiple-output (MIMO) channel. The MARN is thus essentially a serial concatenation of the MAC and the MIMO channel.
Both links have the potential for multi-source concurrent transmission, i.e., information streams from different
sources can be simultaneously transmitted on the same channel. An intuitive scheme is to allow information
streams from different sources to be transmitted concurrently in both links and to jointly decode all sources'
information at the relay and the destination. Single-source transmission schemes, e.g., distributed space-time
coding (DSTC), can be applied straightforwardly following this idea by treating signals from different sources
jointly as a higher-dimensional signal vector. It can be shown that this scheme achieves a symbol rate of 1/2
symbols/source/channel use and the maximum diversity gain of $M$. However, with such a scheme, the decoding
complexities at the relay and the destination are exponential in the number of sources, and thus may become
infeasible for networks with a large number of sources. For complexity considerations, we propose a linear
framework for MARNs. The relay linearly transforms its received signals to generate the forwarded signals without
decoding. The destination separates signals from different sources before the ML decoding of each source's
information. The decoding complexity at the destination is hence linear in the number of sources. To the best of
our knowledge, MARNs with this linear framework have not been explicitly discussed in the literature. It is
noteworthy that this linear framework may constrain the network optimality in some performance measures.

For single-source two-hop cooperative networks, DSTC can achieve the maximum diversity gain without
any CSI at the relay [17]. For the multi-source scenario, one can use DSTC at the relay and assign the
information stream of each source an orthogonal channel in both links. This scheme is denoted as
TDMA$_{S\to R\to D}$, whose achievable diversity gain is $M$ for $1_J \times M_1 \times N_1$ MARNs [18]. Since interference is avoided in both
links, we call this maximum diversity gain the interference-free (int-free) diversity gain. It provides a natural
upperbound on the diversity gain for any multi-source transmission scheme that allows concurrent transmission
of information streams from different sources. However, TDMA$_{S\to R\to D}$ has low spectrum efficiency when the
number of sources is large. In [16], we proposed a multi-source transmission scheme called IC-Relay-TDMA,
in which concurrent multi-source transmission is allowed in the source-relay link only. The relay, knowing the
source-relay channel, performs linear interference cancellation (IC) [19]–[21] to decouple signals from different
sources. In the relay-destination link, the relay forwards information of different sources to the destination
using TDMA. To adopt the same naming system, this scheme is denoted as Concurrent$_{S\to R}$-IC$_R$ instead in
this paper. For an $L_J \times M_1 \times N_1$ MARN, Concurrent$_{S\to R}$-IC$_R$ achieves the maximum int-free diversity gain when
$N \le L\left(1 - \frac{J-1}{M}\right)$ [16]. For the MARN considered in this paper, i.e., each source has only one antenna,
Concurrent$_{S\to R}$-IC$_R$ only achieves a diversity gain of $M - J + 1$, and hence cannot achieve the maximum int-free
diversity gain. Also, the TDMA method in the relay-destination link limits the symbol rate of the network.

The Concurrent$_{S\to R}$-IC$_R$ scheme uses the relay to remove interference from different sources. For the
considered MARN, the multi-antenna destination also has the capability of IC. In this paper, we propose two
schemes in which IC is conducted at the destination rather than the relay. This is desirable for networks with
powerful destinations such as the uplink of cellular systems. The first protocol allows information streams from
different sources to be transmitted simultaneously in both links. The relay conducts DSTC to linearly transform its
received signals without decoding. The destination performs IC to separate signals from different sources. Hence,
this protocol is called Concurrent$_{S\to R\to D}$-IC$_D$. For the second protocol, the sources time-share the source-relay
link. The relay obtains soft estimates of the symbols from each source by MRC, encodes the soft estimates of each
source by one DSTC, then concurrently forwards all sources' DSTCs. The destination performs IC to decouple
signals from different sources. Since information streams of different sources are simultaneously transmitted in
the relay-destination link only, we call this protocol Concurrent$_{R\to D}$-IC$_D$. A brief comparison of the proposed
protocols with TDMA$_{S\to R\to D}$ and Concurrent$_{S\to R}$-IC$_R$ in symbol rate, diversity gain, and CSI requirements is
given in Table 1. Contributions of the proposed protocols are summarized as follows.

1. The proposed protocols fit the linear framework: linear processing without decoding at the relay and 
linear decoding complexity in the number of sources at the destination. Furthermore, they are applicable 
to the interference relay network. 

2. CSI feedback, which is necessary for ZF and MMSE relaying schemes [10], [11], [13], [14], is not required for
the protocols proposed in this paper.

3. We perform rigorous analysis on the diversity gain of the proposed protocols, which to the best of our 
knowledge, is not provided for related work on multi-source cooperative networks. 

4. Concurrent$_{S\to R\to D}$-IC$_D$ achieves a symbol rate of 1/2 symbols/source/channel use, the highest among
the linear schemes in Table 1. Since the symbol rate of each source is independent of the number of
sources, the throughput of the network grows linearly with the number of sources without increasing the
bandwidth. With rigorous analysis, the diversity gain is shown to be upperbounded by $M - J + 1$.

5. Concurrent$_{R\to D}$-IC$_D$ achieves a symbol rate of $\frac{1}{J+1}$ symbols/source/channel use in conjunction with a
diversity gain of $\min\{M, \lfloor M/J \rfloor (N - J + 1)\}$ ($\lfloor x \rfloor$ denotes the maximum integer not greater than $x$).
When $N \ge 2J - 1$, Concurrent$_{R\to D}$-IC$_D$ achieves the maximum int-free diversity gain. Compared with
TDMA$_{S\to R\to D}$, it has a higher symbol rate with no penalty on the diversity gain for networks satisfying
$N \ge 2J - 1$. Compared with Concurrent$_{S\to R}$-IC$_R$ [16], it has the same symbol rate but has an advantage in
the diversity gain for the $1_J \times M_1 \times N_1$ MARN.

The rest of the paper is organized as follows. Section 2 introduces the network model. Section 3 presents
Concurrent$_{S\to R\to D}$-IC$_D$ and analyzes its diversity gain. In Section 4, Concurrent$_{R\to D}$-IC$_D$ is proposed and its
performance is studied. Section 5 provides the numerical results. Conclusions are given in Section 6. Involved
proofs are presented in the appendices.

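Notation: For a matrix $\mathbf{A}$, denote its $(i,j)$th entry as $a_{ij}$. $\mathbf{A}^t$, $\mathbf{A}^*$, and $\bar{\mathbf{A}}$ are the transpose, Hermitian,
and conjugate of $\mathbf{A}$, respectively. $\|\mathbf{A}\|$ is the Frobenius norm of $\mathbf{A}$. For two matrices $\mathbf{A}$ and $\mathbf{B}$ of the same
dimension, $\mathbf{A} \succ \mathbf{B}$ means that $\mathbf{A} - \mathbf{B}$ is positive definite. $\mathbf{I}_n$ is the $n \times n$ identity matrix. $\mathbf{0}_{mn}$ is the $m \times n$
matrix of all zeros. When $m = n$, $\mathbf{0}_{nn}$ is simplified as $\mathbf{0}_n$. $f(x) = o(x)$ means $\lim_{x \to 0^+} f(x)/x = 0$. $\mathrm{E}[x]$ denotes
the expected value of the random variable $x$.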



2 Network Model 

Consider a MARN with $J$ single-antenna sources, one $M$-antenna relay, and one $N$-antenna destination, where
there is no direct connection from the sources to the destination. This MARN is denoted as a $1_J \times M_1 \times N_1$
MARN. We further assume that both the number of relay antennas and the number of destination antennas are no less
than the number of sources, i.e., $J \le \min\{M, N\}$. This condition guarantees full IC at the destination, the details
of which will be shown later. The condition can be realized by user admission control in the upper layers. We
assume that both the relay and the destination know the value of $J$.

Denote the channel coefficient from Source $j$ ($j = 1, \ldots, J$) to the $i$-th ($i = 1, \ldots, M$) relay antenna as
$f_i^{(j)}$, and the channel coefficient from the $i$-th relay antenna to the $n$-th ($n = 1, \ldots, N$) destination antenna
as $g_{in}$. Assume that all channel coefficients are i.i.d. circularly symmetric $\mathcal{CN}(0, 1)$ distributed. In addition,
we assume a block-fading model with coherence interval $T$. The noises at each relay antenna and destination
antenna are modeled as additive white Gaussian noise (AWGN) with zero mean and unit power. Throughout
the paper, we assume global and perfect CSI at the destination. The CSI requirement at the relay depends on
the scheme. In Section 3, the proposed protocol does not need any CSI at the relay; in Section 4, the relay
needs only backward CSI, i.e., the channel information from all sources to itself. The required backward CSI
can be acquired by training [18,22]. No CSI feedback is required for either protocol. To focus on the diversity
gain performance, we assume that all sources and the relay have the same average power constraint. Further,
all nodes are assumed to be perfectly synchronized at the symbol level.

3 The Protocol of Concurrent$_{S\to R\to D}$-IC$_D$

In this section, we propose a protocol that allows concurrent transmission of information streams from
different sources in both the source-relay link and the relay-destination link to achieve the symbol rate of 1/2
symbols/source/channel use. The protocol is thus called Concurrent$_{S\to R\to D}$-IC$_D$. Based on the linear
framework introduced in Section 1, we need to design the linear signal processing at the relay and the destination.
Since DSTC requires only a linear transformation at the relay and achieves the maximum diversity gain in
single-source relay networks [17], we propose to use DSTC for the MARN to gain protection against channel
fading. At the destination, the IC method [19]–[21], originally proposed for multi-user MAC to decouple
interfering signals [23], is used to separate information of different sources. In Subsection 3.1, we describe the details
of the protocol. Subsection 3.2 provides the diversity gain analysis. Subsection 3.3 contains the discussion on
the condition for full IC at the destination and the symbol rate.



3.1 Protocol Description 

We first describe Concurrent$_{S\to R\to D}$-IC$_D$ in the $1_2 \times 2_1 \times N_1$ MARN with two single-antenna sources, one
double-antenna relay, and one $N$-antenna destination; we then consider the $1_J \times 4_1 \times N_1$ MARN, followed by its
generalization to $1_J \times M_1 \times N_1$ MARNs.



3.1.1 Concurrent$_{S\to R\to D}$-IC$_D$ for the $1_2 \times 2_1 \times N_1$ MARN

The protocol of Concurrent$_{S\to R\to D}$-IC$_D$ consists of two steps as shown in Fig. 1. During the first step, each
source collects two symbols $s_1^{(j)}$ and $s_2^{(j)}$ independently and uniformly from the constellation $\mathcal{S}$. Source $j$
transmits a vector of two symbols

$$\mathbf{x}^{(j)} = \sqrt{P}\,[s_1^{(j)}\ \ s_2^{(j)}]^t,$$

and both sources transmit concurrently. The received signal vector at the $i$-th relay antenna can be expressed as

$$\mathbf{r}_i = \sum_{j=1}^{2} f_i^{(j)} \mathbf{x}^{(j)} + \mathbf{v}_i, \qquad (1)$$

where $\mathbf{v}_i$ denotes the $2 \times 1$ AWGN vector at the $i$-th relay antenna. The relay uses Alamouti DSTC [17] to
generate its output signal vector at the $i$-th antenna as

$$\mathbf{t}_i = c\,(\mathbf{A}_i \mathbf{r}_i + \mathbf{B}_i \bar{\mathbf{r}}_i), \qquad (2)$$

where $c = \sqrt{\frac{P}{4P + 2}}$ normalizes the average power at the relay to $P$, and $\mathbf{A}_i$ and $\mathbf{B}_i$ are the $2 \times 2$ encoding
matrices based on the Alamouti design [17].

From (2), the output vector $\mathbf{t}_i$ is a linear transformation of the input vector $\mathbf{r}_i$, i.e., the relay signal processing is
a linear transformation. Since this linear transformation is independent of the channels, the relay does not need
any CSI.
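
As an illustration of the relay processing described above, the following Python sketch applies a two-antenna Alamouti DSTC and checks that a noise-free, single-source transmission arrives at one destination antenna as an Alamouti code matrix times an effective channel vector. The specific matrices `A` and `B`, the helper names, and the use of NumPy are assumptions made for this sketch; the sign convention in the paper may differ.

```python
import numpy as np

# Assumed Alamouti DSTC matrices for a two-antenna relay. This is one common
# choice; the paper's exact convention is not visible in this text.
A = [np.eye(2), np.zeros((2, 2))]
B = [np.zeros((2, 2)), np.array([[0.0, -1.0], [1.0, 0.0]])]


def relay_scale(P, J=2, M=2):
    """Scaling c that keeps the total average relay power at P.

    Each relay antenna receives average power J*P + 1 per symbol and is
    allotted P/M, so c^2 = P / (M * (J*P + 1)).
    """
    return np.sqrt(P / (M * (J * P + 1)))


def relay_forward(r, P, J=2):
    """Map received vectors r[i] to forwarded vectors t[i]; no CSI is used."""
    c = relay_scale(P, J=J, M=len(r))
    return [c * (A[i] @ r[i] + B[i] @ np.conj(r[i])) for i in range(len(r))]


def alamouti_code(s):
    """2x2 Alamouti code matrix built from the symbol pair s."""
    return np.array([[s[0], -np.conj(s[1])], [s[1], np.conj(s[0])]])
```

With these assumed matrices, `relay_scale(P)` for two sources and two relay antennas equals $\sqrt{P/(4P+2)}$, and the signal seen at a destination antenna is `c * sqrt(P) * alamouti_code(s) @ [f1*g1, conj(f2)*g2]` in the absence of noise.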

During the second step, the relay transmits $\mathbf{t}_i$ from its $i$-th antenna, and $\mathbf{t}_1$ and $\mathbf{t}_2$ are concurrently
transmitted. Denote the sampled signal at the $n$-th antenna of the destination and time slot $\tau$ as $x_{\tau n}$. Using the special
structure of the Alamouti design, an equivalent system can be obtained as



(4) 



where $w_{\tau n}$ denotes the AWGN at the $n$-th destination antenna and time slot $\tau$ and $\mathbf{u}_n$ denotes the equivalent
noise vector at the $n$-th antenna of the destination. The $2 \times 2$ equivalent channel matrix $\mathbf{H}_n^{(j)}$ for Source $j$ has
the Alamouti structure.

Note that the equivalent system equation in (4) is similar to that of a MAC with two double-antenna users
except that the noise vector is correlated. Using the IC techniques proposed for MAC in [20], the destination
can fully decouple the information streams from different sources and separately decode the information of 
each source. Without loss of generality, we discuss how the destination decodes the information of Source 1. 

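To cancel the symbols of Source 2, the destination calculates

$$\tilde{\mathbf{x}}_n = \frac{2\mathbf{H}_n^{(2)*}}{\|\mathbf{H}_n^{(2)}\|^2}\,\mathbf{x}_n - \frac{2\mathbf{H}_N^{(2)*}}{\|\mathbf{H}_N^{(2)}\|^2}\,\mathbf{x}_N$$

for $n = 1, \ldots, N - 1$. Since a matrix $\mathbf{H}$ with Alamouti structure satisfies $\mathbf{H}^*\mathbf{H} = \frac{1}{2}\|\mathbf{H}\|^2 \mathbf{I}_2$, each of the two
terms carries the symbols of Source 2 with unit gain, and their difference removes them.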
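
The cancellation step can be checked numerically. The sketch below is a minimal illustration, assuming that every equivalent channel matrix has the Alamouti form `[[a, b], [-conj(b), conj(a)]]` and ignoring noise; the function names are hypothetical and not taken from the paper.

```python
import numpy as np


def alamouti(a, b):
    """2x2 matrix with (assumed) Alamouti structure."""
    return np.array([[a, b], [-np.conj(b), np.conj(a)]])


def cancel_source2(x, H2):
    """Zero-forcing IC that removes Source 2 from the received vectors.

    x[n]  : 2x1 equivalent received vector at destination antenna n
    H2[n] : 2x2 Alamouti-structured equivalent channel of Source 2
    Returns the N-1 interference-free vectors x_tilde[n].
    """
    def normalise(H, v):
        # For an Alamouti matrix, H^* H = (||H||_F^2 / 2) I, so this
        # operation maps H @ s back to s.
        return 2 * H.conj().T @ v / np.linalg.norm(H) ** 2

    last = normalise(H2[-1], x[-1])
    return [normalise(H2[n], x[n]) - last for n in range(len(x) - 1)]
```

Feeding the same Source 1 symbols with two different choices of Source 2 symbols yields identical outputs, which is the decoupling property used by the destination.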
Define $\mathbf{x} = [\mathbf{x}_1^t, \ldots, \mathbf{x}_N^t]^t$, which is a $2N \times 1$ vector, and $\tilde{\mathbf{x}} = [\tilde{\mathbf{x}}_1^t, \ldots, \tilde{\mathbf{x}}_{N-1}^t]^t$, which is a $(2N - 2) \times 1$
vector. The IC process can be represented in a matrix form as in (5),
where the $(2N - 2) \times 2N$ matrix $\mathbf{B}$ is the IC matrix, the $2N \times 2$ matrix $\mathbf{H}_1$ denotes the equivalent channel
matrix for Source 1, and the $(2N - 2) \times 1$ vector $\tilde{\mathbf{n}}$ denotes the remaining equivalent noise vector. $\mathbf{B}$, $\mathbf{H}_1$, and
$\tilde{\mathbf{n}}$ are given in (6).

It can be shown that the equivalent noise vector $\tilde{\mathbf{n}}$ is Gaussian but not white. With straightforward
calculation, the $(2N - 2) \times (2N - 2)$ covariance matrix $\mathbf{R}_{\tilde{n}}$ of $\tilde{\mathbf{n}}$ can be obtained as in (7).
Based on (5), Source 1's information can be recovered using the maximum-likelihood (ML) decoding rule in (8).

Next, we show that the ML decoding in (8) can be decoupled into two symbol-wise ML decodings. It
suffices to show that $\mathbf{H}_1^*\mathbf{B}^*\mathbf{R}_{\tilde{n}}^{-1}\mathbf{B}\mathbf{H}_1$ is a diagonal matrix. Notice that the Alamouti structure [24] is closed under
matrix addition, matrix multiplication, and scalar multiplication. Since the $2 \times 2$ submatrices in $\mathbf{B}$, $\mathbf{H}_1$, and $\mathbf{G}$
have the Alamouti structure, the matrix $\mathbf{H}_1^*\mathbf{B}^*\mathbf{R}_{\tilde{n}}^{-1}\mathbf{B}\mathbf{H}_1$ also has the Alamouti structure in addition
to being Hermitian. Generally, it can be shown that any Hermitian Alamouti matrix is diagonal with equal
diagonal entries. Therefore, $\mathbf{H}_1^*\mathbf{B}^*\mathbf{R}_{\tilde{n}}^{-1}\mathbf{B}\mathbf{H}_1$ is diagonal. The ML decoding in (8) can be decomposed into two
procedures of symbol-wise decoding as in (9),
where $\mathbf{h}_i$ denotes the $i$-th column of $\mathbf{H}_1$. Similarly, the destination can cancel the symbols of Source 1 and
decode the information of Source 2. Four procedures of symbol-wise ML decoding are needed in total to
decode both sources' information.
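
The structural facts used in this argument are easy to verify numerically. The following sketch assumes the Alamouti form `[[a, b], [-conj(b), conj(a)]]` and uses a real scalar in the scalar-multiplication check; the helper names are illustrative only.

```python
import numpy as np


def alamouti(a, b):
    """2x2 matrix with (assumed) Alamouti structure."""
    return np.array([[a, b], [-np.conj(b), np.conj(a)]])


def is_alamouti(X, tol=1e-9):
    """True if X = [[a, b], [-conj(b), conj(a)]] for some complex a, b."""
    return (abs(X[1, 1] - np.conj(X[0, 0])) < tol
            and abs(X[1, 0] + np.conj(X[0, 1])) < tol)


def is_scaled_identity(X, tol=1e-9):
    """True if X is diagonal with equal diagonal entries."""
    return (abs(X[0, 1]) < tol and abs(X[1, 0]) < tol
            and abs(X[0, 0] - X[1, 1]) < tol)


X = alamouti(1 + 2j, 3 - 1j)
Y = alamouti(-2 + 1j, 0.5j)

# Closure under addition, multiplication, and (real) scalar multiplication.
closed = is_alamouti(X + Y) and is_alamouti(X @ Y) and is_alamouti(2.5 * X)

# A Hermitian matrix assembled from Alamouti factors.
herm = X.conj().T @ Y + Y.conj().T @ X
```

Here `herm` is both Hermitian and Alamouti-structured, so `is_scaled_identity(herm)` holds; this is the property that lets the joint ML metric split into independent per-symbol metrics.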



3.1.2 Concurrent$_{S\to R\to D}$-IC$_D$ for the $1_J \times 4_1 \times N_1$ MARN

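This subsection describes Concurrent$_{S\to R\to D}$-IC$_D$ in the MARN with four relay antennas and $J$ sources. During
the first step, Source $j$ transmits a $4 \times 1$ vector consisting of four symbols, i.e.,
$\mathbf{x}^{(j)} = \sqrt{P}\,[s_1^{(j)}\ s_2^{(j)}\ s_3^{(j)}\ s_4^{(j)}]^t$, and all
sources transmit concurrently. The $i$-th relay antenna receives a $4 \times 1$ vector $\mathbf{r}_i$. The relay performs DSTC
with quasi-orthogonal design [17]. The $4 \times 1$ forwarded vector $\mathbf{t}_i$ is generated as $\mathbf{t}_i = c\,(\mathbf{A}_i \mathbf{r}_i + \mathbf{B}_i \bar{\mathbf{r}}_i)$, where
$c = \sqrt{\frac{P}{4(JP+1)}}$ is to constrain the power of the relay to $P$; $\mathbf{A}_i$ and $\mathbf{B}_i$ are DSTC encoding matrices with
quasi-orthogonal design [17], [24]:

$$\mathbf{A}_1 = \mathbf{I}_4,\quad \mathbf{A}_4 = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix},\quad \mathbf{A}_2 = \mathbf{A}_3 = \mathbf{0}_4,$$

$$\mathbf{B}_2 = \begin{bmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{bmatrix},\quad \mathbf{B}_3 = \begin{bmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix},\quad \mathbf{B}_1 = \mathbf{B}_4 = \mathbf{0}_4. \qquad (10)$$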



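During the second step, the relay concurrently forwards $\mathbf{t}_i$. The received signal at the $n$-th destination antenna
can be written as

$$\begin{bmatrix} x_{1n} \\ x_{2n} \\ x_{3n} \\ x_{4n} \end{bmatrix} = \sqrt{P}\,c \sum_{j=1}^{J} \underbrace{\begin{bmatrix} s_1^{(j)} & -\bar{s}_2^{(j)} & -\bar{s}_3^{(j)} & s_4^{(j)} \\ s_2^{(j)} & \bar{s}_1^{(j)} & -\bar{s}_4^{(j)} & -s_3^{(j)} \\ s_3^{(j)} & -\bar{s}_4^{(j)} & \bar{s}_1^{(j)} & -s_2^{(j)} \\ s_4^{(j)} & \bar{s}_3^{(j)} & \bar{s}_2^{(j)} & s_1^{(j)} \end{bmatrix}}_{\mathbf{S}^{(j)}} \begin{bmatrix} f_1^{(j)} g_{1n} \\ \bar{f}_2^{(j)} g_{2n} \\ \bar{f}_3^{(j)} g_{3n} \\ f_4^{(j)} g_{4n} \end{bmatrix} + c\,[\hat{\mathbf{v}}_1\ \hat{\mathbf{v}}_2\ \hat{\mathbf{v}}_3\ \hat{\mathbf{v}}_4] \begin{bmatrix} g_{1n} \\ g_{2n} \\ g_{3n} \\ g_{4n} \end{bmatrix} + \begin{bmatrix} w_{1n} \\ w_{2n} \\ w_{3n} \\ w_{4n} \end{bmatrix},$$

where $x_{\tau n}$ denotes the sampled signal at the $n$-th destination antenna and time slot $\tau$, and
$\hat{\mathbf{v}}_i = \mathbf{A}_i \mathbf{v}_i + \mathbf{B}_i \bar{\mathbf{v}}_i$ is the relay noise at the $i$-th antenna after DSTC encoding. It can be observed that
$\mathbf{S}^{(j)}$ has quasi-orthogonal structure due to the DSTC at the relay. Using the IC techniques in [21], we can break
the system into two equivalent Alamouti systems as in (11) and (12),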
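where $\mathbf{u}_n^+$ and $\mathbf{u}_n^-$ denote the equivalent noise vector for each system. With $v_{\tau i}$ denoting the $\tau$-th entry
of $\mathbf{v}_i$, they have the following expressions:

$$\mathbf{u}_n^+ = c \begin{bmatrix} (v_{11} + v_{41})\,g_{1n} + (-\bar{v}_{22} + \bar{v}_{32})\,g_{2n} + (-\bar{v}_{33} + \bar{v}_{23})\,g_{3n} + (v_{44} + v_{14})\,g_{4n} \\ (\bar{v}_{21} - \bar{v}_{31})\,\bar{g}_{1n} + (v_{12} + v_{42})\,\bar{g}_{2n} + (-v_{43} - v_{13})\,\bar{g}_{3n} + (-\bar{v}_{34} + \bar{v}_{24})\,\bar{g}_{4n} \end{bmatrix} + \begin{bmatrix} w_{1n} + w_{4n} \\ \bar{w}_{2n} - \bar{w}_{3n} \end{bmatrix},$$

$$\mathbf{u}_n^- = c \begin{bmatrix} (v_{11} - v_{41})\,g_{1n} + (-\bar{v}_{22} - \bar{v}_{32})\,g_{2n} + (-\bar{v}_{33} - \bar{v}_{23})\,g_{3n} + (v_{44} - v_{14})\,g_{4n} \\ (\bar{v}_{21} + \bar{v}_{31})\,\bar{g}_{1n} + (v_{12} - v_{42})\,\bar{g}_{2n} + (-v_{43} + v_{13})\,\bar{g}_{3n} + (-\bar{v}_{34} - \bar{v}_{24})\,\bar{g}_{4n} \end{bmatrix} + \begin{bmatrix} w_{1n} - w_{4n} \\ \bar{w}_{2n} + \bar{w}_{3n} \end{bmatrix}.$$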


The destination uses $J - 1$ antennas to cancel the symbols of the $J - 1$ interfering sources by the multi-user IC
technique in [20] for each Alamouti system in (11) and (12), and thus decouples the information streams from different
sources. The destination then recovers the information of each source separately using ML decoding. Since the
detailed formulas can be found in [21], we do not repeat the IC procedure here.
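
To make the split into two Alamouti systems concrete, the sketch below builds a $4 \times 4$ quasi-orthogonal code matrix and checks, without noise, that the combinations $(x_1 + x_4, \overline{x_2 - x_3})$ and $(x_1 - x_4, \overline{x_2 + x_3})$ each behave as a $2 \times 2$ Alamouti-type system in a pair of combined symbols. The sign pattern of `qo_code` and the particular combinations are assumptions made for this illustration; the paper's exact equations are not visible in this text.

```python
import numpy as np


def qo_code(s):
    """4x4 quasi-orthogonal code matrix (assumed sign pattern)."""
    s1, s2, s3, s4 = s
    c = np.conj
    return np.array([
        [s1, -c(s2), -c(s3), s4],
        [s2, c(s1), -c(s4), -s3],
        [s3, -c(s4), c(s1), -s2],
        [s4, c(s3), c(s2), s1],
    ])


def split_into_alamouti(x):
    """Combine the four received samples into two 2x1 vectors."""
    x1, x2, x3, x4 = x
    plus = np.array([x1 + x4, np.conj(x2 - x3)])
    minus = np.array([x1 - x4, np.conj(x2 + x3)])
    return plus, minus


def alamouti_channel(alpha, beta):
    """2x2 equivalent channel with orthogonal columns."""
    return np.array([[alpha, -beta], [np.conj(beta), np.conj(alpha)]])
```

Under these assumptions, with effective channel `h`, the "plus" system carries the symbols `(s1 + s4, conj(s2 - s3))` through `alamouti_channel(h[0] + h[3], h[1] - h[2])`, and the "minus" system carries `(s1 - s4, conj(s2 + s3))` through `alamouti_channel(h[0] - h[3], h[1] + h[2])`.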



3.1.3 Concurrent$_{S\to R\to D}$-IC$_D$ for $1_J \times M_1 \times N_1$ MARNs

To use the protocol in MARNs with a general $M$, each source transmits a vector of $2^n$ symbols, with $2^n$ the
minimum power of two that is no less than $M$. The relay designs the DSTC using the first $M$ columns of a $2^n \times 2^n$
quasi-orthogonal space-time block code (STBC) with ABBA structure [25], [26]. The destination separates the
system into $2^{n-1}$ Alamouti systems and decouples the signals from different sources using the IC procedure
in [16], [21]. Each source's information can be decoded separately at the destination. The decoding complexity
is thus linear in the number of sources.



3.2 Diversity Gain Analysis 

In this subsection, we analyze the diversity gain of Concurrent$_{S\to R\to D}$-IC$_D$. Due to the concatenation of the
channels, a direct diversity analysis from the system equation (5) is challenging. Instead, we work on an
equivalent representation for the tractability of the analysis. The equivalent representation captures the effect
of the IC at the destination on the first step of transmission. For Concurrent$_{S\to R\to D}$-IC$_D$, although the ZF IC
procedure is conducted at the destination, there is a virtual ZF at the relay and a dimension reduction filtering
at the destination. We first derive this system representation in Subsection 3.2.1, then prove an upperbound on
the diversity gain of Concurrent$_{S\to R\to D}$-IC$_D$ in Subsection 3.2.2.



3.2.1 An equivalent representation for Concurrent$_{S\to R\to D}$-IC$_D$ in $1_J \times M_1 \times N_1$ MARNs

Since the network parameters and the processing at the relay and the destination are statistically equivalent for 
all sources, the diversity gains of all sources are identical. Thus, we only focus on Source 1. 

For the simplicity of the presentation, we first look at the system equation of the $1_2 \times 2_1 \times N_1$ MARN in
(5). Notice that each entry in the channel matrix $\mathbf{B}\mathbf{H}_1$ is a rational function of the channel coefficients of both
links. The entries are therefore neither independent nor Gaussian, which complicates the diversity gain analysis. In
the following, we derive an equivalent system representation to decouple the channel concatenation, which will
help the diversity gain analysis. The system equation in (5) can be rewritten as (13), whose channel matrix is
$\mathbf{B}\mathbf{G}\mathbf{F}^{(1)}$, where $\mathbf{G}$ collects the relay-destination channel coefficients and $\mathbf{F}^{(j)}$ collects the source-relay
channel coefficients of Source $j$. Note that the IC matrix $\mathbf{B}$ zero-forces the channels
of Source 2, i.e., $\mathbf{B}\mathbf{G}\mathbf{F}^{(2)} = \mathbf{0}$. In other words, $\mathbf{B}\mathbf{G}$ nulls out $\mathbf{F}^{(2)}$, so the rows of $\mathbf{B}\mathbf{G}$ are orthogonal
to the column space of $\mathbf{F}^{(2)}$. Therefore, the channel matrix in (13) is invariant if $\mathbf{F}^{(1)}$ is first projected onto the
null space of $\mathbf{F}^{(2)}$, i.e., $\mathbf{B}\mathbf{G}\mathbf{F}^{(1)} = \mathbf{B}\mathbf{G}\mathbf{\Phi}\mathbf{F}^{(1)}$, where $\mathbf{\Phi}$ is the projection matrix onto the null space of $\mathbf{F}^{(2)}$, i.e.,

$$\mathbf{\Phi} = \mathbf{I}_4 - \frac{2\,\mathbf{F}^{(2)}\mathbf{F}^{(2)*}}{\mathrm{tr}\left(\mathbf{F}^{(2)}\mathbf{F}^{(2)*}\right)}.$$

Thus, (13) can be rewritten as (14).

This new system equation can be interpreted as follows. Symbols $s_1$ and $s_2$ are first transmitted through
channel $\mathbf{F}^{(1)}$ to the relay. Then, the ZF operation $\mathbf{\Phi}$ is conducted to null out the information of Source 2. After
that, signals are forwarded through channel $\mathbf{G}$ and the destination applies a filter $\mathbf{B}$ to reduce the dimension of
the received signal vector from $2N \times 1$ to $2(N - 1) \times 1$. The virtual ZF at the relay and the dimension reduction
at the destination are due to the ZF IC at the destination. A diagram illustrating this process is shown in Fig. 2.
For general $J$ and $M$, with the same argument, the ZF at the destination induces a virtual ZF at the relay
followed by a dimension reduction at the destination. It can be shown that the dimensions of $\mathbf{F}^{(1)}$, $\mathbf{G}$, and $\mathbf{B}$ are
$2^{n+1} \times 2^n$, $2^n N \times 2^{n+1}$, and $2^n (N - J + 1) \times 2^n N$, respectively, where $2^n$ is the minimum power of two no less
than $M$. The virtual ZF operation at the relay nulls out the information from Sources 2 to $J$. The dimension
reduction filter $\mathbf{B}$ decreases the dimension of the received signal vector from $2^n N \times 1$ to $2^n (N - J + 1) \times 1$.
Although the new system representation looks more complicated than the original one, it simplifies the diversity
gain analysis.
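
The invariance argument behind the virtual ZF can be checked with generic matrices. The sketch below is a minimal illustration that uses the general orthogonal-projector formula $\mathbf{I} - \mathbf{F}(\mathbf{F}^*\mathbf{F})^{-1}\mathbf{F}^*$ rather than any closed form specific to the paper; `W` plays the role of $\mathbf{B}\mathbf{G}$, and all names and the random test matrices are assumptions.

```python
import numpy as np


def null_space_projector(F2):
    """Orthogonal projector onto the complement of the column space of F2."""
    n = F2.shape[0]
    gram = F2.conj().T @ F2
    return np.eye(n) - F2 @ np.linalg.solve(gram, F2.conj().T)


def crandn(rng, *shape):
    """Complex Gaussian matrix used only as a stand-in for channel matrices."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


rng = np.random.default_rng(3)
F1, F2 = crandn(rng, 4, 2), crandn(rng, 4, 2)
Phi = null_space_projector(F2)

# Any filter W with W @ F2 = 0 (as BG satisfies for Source 2) sees the same
# effective channel whether or not F1 is projected first.
W = crandn(rng, 2, 4) @ Phi
same_channel = np.allclose(W @ F1, W @ Phi @ F1)
```

The equality `W @ F1 == W @ Phi @ F1` holds because `W @ F2 = 0` implies `W @ Phi = W`, which is the step that moves the zero-forcing from the destination to a virtual operation at the relay.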
3.2.2 Diversity gain upperbound

Diversity gain is defined as the negative of the asymptotic slope of the bit error rate (BER) with respect to the
average transmit SNR in the high SNR regime. In [27], it is shown that for a communication system represented
by the equation $\mathbf{y} = \mathbf{h}s + \mathbf{n}$, where $\mathbf{h}$, $s$, and $\mathbf{n}$ are the channel vector, scalar symbol, and noise vector,
respectively, the diversity gain can be calculated using the outage probability of the instantaneous normalized
receive SNR $\gamma$ as

$$d = \lim_{\epsilon \to 0^+} \frac{\log P(\gamma < \epsilon)}{\log \epsilon}, \qquad (15)$$

where the instantaneous normalized receive SNR is defined as $\gamma = \mathbf{h}^*\mathbf{R}_n^{-1}\mathbf{h}$ and $\mathbf{R}_n$ is the covariance matrix
of $\mathbf{n}$. This technique is usually easier than the direct calculation based on the error rate, and is thus used in this
paper to analyze the diversity gain of Concurrent$_{S\to R\to D}$-IC$_D$. Based on the equivalent system equation in (14),
the following theorem is proved.

Theorem 1. In $1_J \times M_1 \times N_1$ MARNs, the diversity gain of Concurrent$_{S\to R\to D}$-IC$_D$ is upperbounded by
$M - J + 1$.

Proof. See Appendix A. □

Theorem 1 can be intuitively explained as follows. Since the first-step transmission is a MAC with an
$M$-antenna receiver and the virtual ZF operation at the relay requires the use of $J - 1$ antennas to null out the
information of $J - 1$ sources, the diversity gain achievable after the virtual ZF is no higher than $M - J + 1$.
The second-step transmission and the dimension reduction at the destination cannot improve the diversity gain
of the first step. Therefore, the protocol has at most a diversity gain of $M - J + 1$. When $J = 2$ and $M = 2$,
the following diversity result can be obtained as a special case of Theorem 1.

Corollary 1. In the $1_2 \times 2_1 \times N_1$ MARN, the diversity gain of Concurrent$_{S\to R\to D}$-IC$_D$ is upperbounded by 1.
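
The outage-based definition of diversity gain can be illustrated on a toy channel that is not the equivalent system of this paper: if $\mathbf{h}$ has $L$ i.i.d. $\mathcal{CN}(0,1)$ entries and the noise is white with unit variance, then $\gamma$ is a sum of $L$ unit-mean exponential variables and the limit in the definition evaluates to $L$. The sketch below estimates the slope from two small thresholds; the function names and threshold values are assumptions of this example.

```python
import math


def outage_probability(eps, L):
    """P(gamma < eps) for gamma a sum of L i.i.d. unit-mean exponentials.

    Uses P = exp(-eps) * sum_{k >= L} eps^k / k!, which avoids the
    cancellation of 1 - exp(-eps) * (...) at small eps.
    """
    total = 0.0
    term = eps ** L / math.factorial(L)
    k = L
    while term > 1e-300 and k < L + 60:
        total += term
        k += 1
        term *= eps / k
    return math.exp(-eps) * total


def diversity_estimate(L, eps_hi=1e-3, eps_lo=1e-4):
    """Finite-difference estimate of log P(gamma < eps) / log eps."""
    num = (math.log(outage_probability(eps_hi, L))
           - math.log(outage_probability(eps_lo, L)))
    return num / (math.log(eps_hi) - math.log(eps_lo))
```

For this toy model the estimate approaches $L$ as the thresholds shrink, which mirrors how an upperbound on the number of independent fading paths left after the virtual ZF translates into an upperbound on the diversity gain.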

3.3 Discussion 

In this subsection, we discuss the properties of Concurrent$_{S\to R\to D}$-IC$_D$, including the condition on the network
parameters for full IC, the symbol rate, and its comparison with existing schemes. Finally, we present its
possible applications in the interference relay network.

First we consider the condition on the network parameters to achieve full IC. From the IC procedure in [21],
at least $J$ receive antennas are required to decouple signals from $J$ sources. Thus, Concurrent$_{S\to R\to D}$-IC$_D$
requires the number of destination antennas to be no less than the number of sources. In addition, a condition
on the number of relay antennas is required. To show this, we start with the example in the $1_2 \times 2_1 \times N_1$ MARN.
From (14), the equivalent channel vector experienced by $s_1$ in the source-relay link is the first column of $\mathbf{F}^{(1)}$,
i.e., a $4 \times 1$ vector whose first and fourth entries are determined by $f_1^{(1)}$ and $f_2^{(1)}$ and whose other entries are
zero, so that it lies in a 2-dimensional subspace. It is discussed in Subsection 3.2.1 that the
IC operation at the destination creates a virtual ZF operation at the relay. Then, the equivalent channel vector at
the relay can be projected onto the null space of the equivalent channel vector of at most one interfering source.
In other words, the virtual ZF operation at the relay can null out interference from at most one source and
the network can allow at most two sources to transmit simultaneously. In general $1_J \times M_1 \times N_1$ MARNs, the
equivalent channel vector at the relay is a $2^{n+1} \times 1$ vector in an $M$-dimensional subspace. The virtual ZF operation
at the relay can null out at most $M - 1$ information streams from interfering sources. Thus, the number of relay
antennas also needs to be no less than the number of sources. With Concurrent$_{S\to R\to D}$-IC$_D$, the $1_J \times M_1 \times N_1$
MARN admits at most $\min\{M, N\}$ sources to concurrently transmit information.

Now we discuss the symbol rate of the scheme. The multi-source transmission in Concurrent$_{S\to R\to D}$-IC$_D$
improves the spectrum efficiency of the network. In the first step, each source sends a vector of $T = 2^n$ symbols
in $T$ channel uses, where $2^n$ is the minimum power of two no less than $M$. Using DSTC with quasi-orthogonal
design at the relay [17], another $T$ channel uses are required for the second step. Overall, $2T$ channel uses are
required to send $T$ symbols from end to end. Thus, the symbol rate is 1/2 symbols/source/channel use, which
is independent of the number of sources.

Next, we compare the diversity gain, symbol rate, and CSI requirements at the relay of Concurrent$_{S\to R\to D}$-IC$_D$
with two existing schemes, TDMA$_{S\to R\to D}$ and Concurrent$_{S\to R}$-IC$_R$ [16], that also fit the linear
framework. The results are shown in Table 1. Concurrent$_{S\to R\to D}$-IC$_D$ achieves a higher symbol rate than
Concurrent$_{S\to R}$-IC$_R$ and TDMA$_{S\to R\to D}$. For Concurrent$_{S\to R\to D}$-IC$_D$, the throughput of the network grows
linearly with the number of sources, while the throughput of the network is $\frac{J}{J+1}$ for Concurrent$_{S\to R}$-IC$_R$
and 1/2 for TDMA$_{S\to R\to D}$, both of which remain bounded as $J$ grows large. However, from Theorem 1, the
diversity gain of Concurrent$_{S\to R\to D}$-IC$_D$ is upperbounded by $M - J + 1$, and is thus inferior to that of
TDMA$_{S\to R\to D}$ and no better than that of Concurrent$_{S\to R}$-IC$_R$. For Concurrent$_{S\to R\to D}$-IC$_D$, diversity gain
degradation is the price paid for a higher symbol rate. Regarding CSI, the relay does not need any channel
information for Concurrent$_{S\to R\to D}$-IC$_D$ or TDMA$_{S\to R\to D}$, while for Concurrent$_{S\to R}$-IC$_R$, the relay needs to
know the source-relay channels.
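
The rate figures quoted in this comparison can be tabulated with a few lines of Python. The sketch below only restates the numbers given in the text (block length $T = 2^n$, $2T$ channel uses per block, and network throughputs of $J/2$, $J/(J+1)$, and $1/2$); the function names and dictionary keys are hypothetical labels.

```python
from fractions import Fraction


def block_length(M):
    """Smallest power of two that is no less than M (M >= 1)."""
    return 1 << (M - 1).bit_length()


def per_source_rate_concurrent(M):
    """T symbols are delivered end to end in 2T channel uses."""
    T = block_length(M)
    return Fraction(T, 2 * T)


def max_concurrent_sources(M, N):
    """Full IC needs at least as many relay and destination antennas as sources."""
    return min(M, N)


def network_throughput(J):
    """Aggregate symbols per channel use for the three linear schemes."""
    return {
        "Concurrent S->R->D, IC at destination": Fraction(J, 2),
        "Concurrent S->R, IC at relay": Fraction(J, J + 1),
        "TDMA S->R->D": Fraction(1, 2),
    }
```

For example, `network_throughput(4)` gives 2, 4/5, and 1/2 symbols per channel use, which shows the linear growth of the first scheme against the bounded throughput of the other two.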

Finally, Concurrent$_{S\to R\to D}$-IC$_D$ can be applied in more general network models. Concurrent$_{S\to R\to D}$-IC$_D$
can be used in MARNs with distributed relay antennas. From (2), the forwarded signal from a relay antenna
$\mathbf{t}_i$ only depends on its own received signal $\mathbf{r}_i$. No cross-talk between relay antennas is needed. Thus, the relay
antennas do not have to be collocated to conduct the scheme. It is the total number of relay antennas that matters.
Furthermore, Concurrent$_{S\to R\to D}$-IC$_D$ can also be straightforwardly used in the interference relay network with
multi-antenna destinations. The protocol description shows that each destination can use its multiple antennas
to decouple the information streams of multiple sources and decode the information of its interest as long as the
numbers of destination antennas and distributed relay antennas are no less than the number of sources.

4 The Protocol of Concurrent$_{R\to D}$-IC$_D$

Although Concurrent$_{S\to R\to D}$-IC$_D$ improves the spectrum efficiency of MARNs, it cannot achieve the maximum
int-free diversity. In this section, we propose another protocol that has the potential of achieving the
int-free diversity gain with a higher symbol rate than TDMA$_{S\to R\to D}$. In $1_J \times M_1 \times N_1$ MARNs,
the source-relay link has $M$ independent channel paths for each source and the relay-destination link has $MN$
independent channel paths. The diversity gain is thus bottlenecked by the source-relay link. We propose to
use TDMA in the source-relay link to achieve the maximum diversity gain, and in the relay-destination link,
concurrent transmission of information streams from different sources is designed to improve the symbol rate.
We denote this protocol as Concurrent$_{R\to D}$-IC$_D$. In Subsection 4.1, we present details of the protocol. We
analyze the diversity gain of the protocol in Subsection 4.2 and compare it with other schemes in Subsection 4.3.



4.1 Protocol Description 

Since Concurrent$_{R\to D}$-IC$_D$ uses TDMA in the source-relay link, the main challenge in the design is to allow
concurrent transmission of multiple sources in the relay-destination link, and to decouple the multiple information
streams at the destination. Our proposed protocol requires the relay to know its channels with all sources,
which can be obtained by training, and does not require CSI feedback. It fits the linear framework introduced
in Section 1. In what follows, we first explain the protocol for the $1_J \times (2J)_1 \times N_1$ MARN, followed by the
$1_J \times (4J)_1 \times N_1$ MARN, then extend the design to the general case of $1_J \times M_1 \times N_1$ MARNs.



4.1.1 Concurrent$_{R\to D}$-IC$_D$ for the $1_J \times (2J)_1 \times N_1$ MARN

In this subsection, we describe Concurrent$_{R\to D}$-IC$_D$ for the MARN in which the number of relay antennas is
twice the number of sources. The protocol of Concurrent$_{R\to D}$-IC$_D$ consists of two steps, as shown in
Fig. 3. During the first step, two symbols randomly selected from one constellation $\mathcal{S}$ are collected by Source
$j$ to form a vector $s^{(j)} = [s_1^{(j)} \;\; s_2^{(j)}]^t$. Source $j$ uses time slots $2j - 1$ and $2j$ to send $s^{(j)}$. In other words,
sources transmit to the relay in TDMA. In time slots $2j - 1$ and $2j$, the $i$-th relay antenna overhears

$$ r_i^{(j)} = \begin{bmatrix} r_{(2j-1)i} \\ r_{(2j)i} \end{bmatrix} = \sqrt{P}\, f_i^{(j)} s^{(j)} + v_i^{(j)}, \qquad v_i^{(j)} = \begin{bmatrix} v_{(2j-1)i} \\ v_{(2j)i} \end{bmatrix}, \qquad i = 1, \ldots, 2J, \;\; j = 1, \ldots, J, $$

where $r_{\tau i}$ and $v_{\tau i}$ denote the received signal and the AWGN at the $i$-th relay antenna and time slot $\tau$, respectively.
The relay coherently combines signals at each antenna to maximize the SNR of Source $j$'s transmission
and obtains a soft estimate of $s^{(j)}$ as

$$ \hat{r}^{(j)} = \frac{\sum_{i=1:2J} \overline{f_i^{(j)}}\, r_i^{(j)}}{\sum_{i=1:2J} |f_i^{(j)}|^2} = \sqrt{P}\, s^{(j)} + \underbrace{\frac{\sum_{i=1:2J} \overline{f_i^{(j)}}\, v_i^{(j)}}{\sum_{i=1:2J} |f_i^{(j)}|^2}}_{\hat{v}^{(j)}}, \qquad (16) $$

where the $2 \times 1$ noise vector $\hat{v}^{(j)}$ has i.i.d. $\mathcal{CN}\big(0, (\sum_{i=1:2J} |f_i^{(j)}|^2)^{-1}\big)$ entries. The relay uses Alamouti
DSTC [17] to encode the soft estimate of $s^{(j)}$ into

$$ [\, t_{2j-1} \;\; t_{2j} \,] = \sqrt{\frac{P}{MP + M}}\, [\, A_1 \hat{r}^{(j)} \;\; B_2 \overline{\hat{r}^{(j)}} \,], \qquad (17) $$



where $\sqrt{\frac{P}{MP+M}}$ is to constrain the average relay power to $P$; the encoding matrices $A_1$ and $B_2$ are given in
(10). From (16) and (17), the relay generates the signal $t_i$ by a linear transformation from its received signal $r_i$.
During the second step, the relay forwards the $2 \times 1$ vector $t_i$ using its $i$-th antenna. All relay antennas
transmit simultaneously to realize concurrent transmissions of all sources' information streams. From (17),
we can see that each source is assigned two antennas and $J$ Alamouti DSTCs are concurrently transmitted to
the destination. Denote $y_{\tau n}$ as the received signal at time slot $\tau$ and the $n$-th antenna at the destination. An
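equivalent system can be obtained as

$$ y_n = \sqrt{\frac{P^2}{MP + M}} \sum_{j=1:J} G_{jn}\, s^{(j)} + \sqrt{\frac{P}{MP + M}} \sum_{j=1:J} G_{jn}\, \hat{v}^{(j)} + w_n, \qquad n = 1, \ldots, N, \qquad (18) $$

in which the $2 \times 1$ vectors $y_n$ and $w_n$ are built from the received signals $y_{1n}, y_{2n}$ and the noises $w_{1n}, w_{2n}$ at the $n$-th destination antenna,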



where $v_i^{(j)}$ is the $i$-th entry of $\hat{v}^{(j)}$ in (16). Note that the $2 \times 2$ equivalent channel matrix $G_{jn}$ has Alamouti
structure. The equivalent system equation in (18) is similar to that of a multi-user multi-antenna MAC system
except that the equivalent noises are not white. By applying the multi-user IC schemes in [20], the destination
can iteratively cancel the symbols of $J - 1$ interfering sources using signals at any $J - 1$ antennas. For full IC,
$N \ge J$ is required. Here, we provide a compact matrix representation of this algorithm, which is not provided
in [20, 21], because the resulting equations are needed for the diversity analysis. Without loss of generality, we
show how the destination cancels the information of Sources 2 to $J$ and obtains int-free observations of Source
1 in $J - 1$ iterations.

Stack $y_n$ to obtain $y = [y_1^t, \ldots, y_N^t]^t$ and let $G_j = [G_{j1}^t \cdots G_{jN}^t]^t$ for $j = 1, \ldots, J$. The iterative process
is described as follows:

Initialization: $\mathcal{G}(0) = [\, G_1 \; \cdots \; G_J \,]$, $y(0) = y$.



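For the $i$-th iteration, $i = 1, \ldots, J - 1$:

1. Form the $2(N - i) \times 2(N - i + 1)$ IC matrix $B(i)$ as

$$ B(i) = \begin{bmatrix}
\frac{\mathcal{G}^*_{J-i+1,1}(i-1)}{\|\mathcal{G}_{J-i+1,1}(i-1)\|_F^2} & -\frac{\mathcal{G}^*_{J-i+1,2}(i-1)}{\|\mathcal{G}_{J-i+1,2}(i-1)\|_F^2} & 0_2 & \cdots & 0_2 \\
0_2 & \frac{\mathcal{G}^*_{J-i+1,2}(i-1)}{\|\mathcal{G}_{J-i+1,2}(i-1)\|_F^2} & -\frac{\mathcal{G}^*_{J-i+1,3}(i-1)}{\|\mathcal{G}_{J-i+1,3}(i-1)\|_F^2} & \cdots & 0_2 \\
\vdots & & \ddots & \ddots & \vdots \\
0_2 & \cdots & 0_2 & \frac{\mathcal{G}^*_{J-i+1,N-i}(i-1)}{\|\mathcal{G}_{J-i+1,N-i}(i-1)\|_F^2} & -\frac{\mathcal{G}^*_{J-i+1,N-i+1}(i-1)}{\|\mathcal{G}_{J-i+1,N-i+1}(i-1)\|_F^2}
\end{bmatrix}, \qquad (19) $$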



where the $2 \times 2$ matrix $\mathcal{G}_{p,q}(i)$ denotes the $(p, q)$-th $2 \times 2$ submatrix of $\mathcal{G}(i)$.

2. Cancel the symbols of Source $J - i + 1$ by calculating $y(i) = B(i)\, y(i - 1)$.

3. Form the $2(N - i) \times 2J$ remaining equivalent channel matrix $\mathcal{G}(i)$ as $\mathcal{G}(i) = B(i)\, \mathcal{G}(i - 1)$.
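A single IC iteration can be illustrated with a small, noiseless sketch for $J = 2$ sources and $N = 2$ destination antennas. Everything here is an assumption made for illustration: the channel values are arbitrary, the helper names are hypothetical, and the Alamouti convention $[[a, b], [-\bar{b}, \bar{a}]]$ may differ from the equivalent-channel form used in the paper. The sketch only checks the two properties the algorithm relies on: the combination removes the interfering source, and the remaining equivalent channel keeps Alamouti structure.

```python
def alamouti(a: complex, b: complex):
    """Return the 2x2 block [[a, b], [-conj(b), conj(a)]] (assumed convention)."""
    return [[a, b], [-b.conjugate(), a.conjugate()]]


def herm(m):
    """Conjugate transpose of a 2x2 matrix."""
    return [[m[0][0].conjugate(), m[1][0].conjugate()],
            [m[0][1].conjugate(), m[1][1].conjugate()]]


def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def matvec(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[r][0] * v[0] + m[r][1] * v[1] for r in range(2)]


def fro2(m):
    """Squared Frobenius norm of a 2x2 matrix."""
    return sum(abs(e) ** 2 for row in m for e in row)


def scale(m, c):
    """Multiply every entry of a 2x2 matrix by the scalar c."""
    return [[c * e for e in row] for row in m]


def add(x, y):
    """Entry-wise sum of two 2x2 matrices."""
    return [[x[r][c] + y[r][c] for c in range(2)] for r in range(2)]


# Hypothetical equivalent channels, keyed by (source, destination antenna).
G = {
    (1, 1): alamouti(1 + 2j, 0.5 - 1j),
    (1, 2): alamouti(-0.3 + 0.8j, 1.1 + 0.2j),
    (2, 1): alamouti(0.7 - 0.4j, -1.2 + 0.9j),
    (2, 2): alamouti(0.2 + 1.5j, 0.6 - 0.6j),
}
s1 = [1 + 0j, -1 + 0j]  # illustrative symbols of Source 1
s2 = [-1 + 0j, 1 + 0j]  # illustrative symbols of Source 2

# Noiseless received vector at each destination antenna.
y = {}
for n in (1, 2):
    part1 = matvec(G[(1, n)], s1)
    part2 = matvec(G[(2, n)], s2)
    y[n] = [part1[0] + part2[0], part1[1] + part2[1]]

# One block row of the IC matrix: [G21^*/||G21||_F^2, -G22^*/||G22||_F^2].
b1 = scale(herm(G[(2, 1)]), 1 / fro2(G[(2, 1)]))
b2 = scale(herm(G[(2, 2)]), -1 / fro2(G[(2, 2)]))

z1 = matvec(b1, y[1])
z2 = matvec(b2, y[2])
z = [z1[0] + z2[0], z1[1] + z2[1]]  # observation after cancelling Source 2

leak = add(matmul(b1, G[(2, 1)]), matmul(b2, G[(2, 2)]))  # channel left for Source 2
h1 = add(matmul(b1, G[(1, 1)]), matmul(b2, G[(1, 2)]))    # channel left for Source 1

print(max(abs(e) for row in leak for e in row))  # numerically zero
```

The cancellation works because an Alamouti-structured block $G$ satisfies $G^* G = \frac{1}{2}\|G\|_F^2 I$, so the two terms of the interfering source contribute $\frac{1}{2}I$ and $-\frac{1}{2}I$.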

Note that $y(i)$ is the $2(N - i) \times 1$ signal vector after cancelling the information of Source $J - i + 1$. After
$J - 1$ iterations, $y(J - 1)$ only contains the information of Source 1 and has dimension $2(N - J + 1) \times 1$. Let
$B = \prod_{i=1:J-1} B(i) = B(J-1) \cdots B(1)$. This iterative IC process can be expressed as a linear operation on $y$ as

$$ y(J - 1) = B y = \sqrt{\frac{P^2}{MP + M}}\, B G_1 s^{(1)} + \underbrace{\sqrt{\frac{P}{MP + M}}\, B G_1 \hat{v}^{(1)} + B w}_{n}, \qquad (20) $$



where $w = [w_1^t \cdots w_N^t]^t$ and $n$ denotes the equivalent noise vector after IC. Note that $n$ is Gaussian but not
white. After straightforward calculation, its covariance matrix can be calculated as

$$ R_n = \Big( \sum_{i=1:M} |f_i^{(1)}|^2 \Big)^{-1} c_1^2\, B G_1 G_1^* B^* + B B^*, \qquad (21) $$

where $c_1 = \sqrt{\frac{P}{MP + M}}$. The ML decoding of Source 1's information can be performed as

$$ \arg\min_{s^{(1)}} \big( B y - \sqrt{P}\, c_1 B G_1 s^{(1)} \big)^* R_n^{-1} \big( B y - \sqrt{P}\, c_1 B G_1 s^{(1)} \big). \qquad (22) $$



Since the $2 \times 2$ submatrices of $B$, $G_1$, and $R_n$ have Alamouti structure, (22) can be further decoupled into
two procedures of symbol-wise decoding, following an argument similar to that in Subsection 3.1.1. Likewise, the
information of the other $J - 1$ sources can be decoupled and decoded. In total, $2J$ symbol-wise ML decoding
procedures are required to decode all sources' information. Therefore, the decoding complexity is linear in the
number of sources.

4.1.2 Concurrent$_{R\to D}$-IC$_D$ for the $1_J \times (4J)_1 \times N_1$ MARN

In this subsection, we describe Concurrent$_{R\to D}$-IC$_D$ for the MARN where the number of relay antennas is four
times the number of sources. During the first step, Source $j$ collects four symbols $s_i^{(j)}$ ($i = 1, \ldots, 4$),
in which $s_1^{(j)}, s_2^{(j)} \in \mathcal{S}$ and $s_3^{(j)}, s_4^{(j)} \in \mathcal{S}'$. The constellation $\mathcal{S}'$ is obtained by rotating $\mathcal{S}$. Source
$j$ transmits a vector of these four symbols to the relay in four time slots, and sources timeshare the source-relay
link. The relay obtains a soft estimate of each symbol from Source $j$ by coherently combining signals at
different antennas as in (16), then linearly transforms this soft estimate $\hat{r}^{(j)}$ into a quasi-orthogonal DSTC by

$$ [\, t_{4j-3} \;\; t_{4j-2} \;\; t_{4j-1} \;\; t_{4j} \,] = c_2\, [\, A_1 \hat{r}^{(j)} \;\; B_2 \overline{\hat{r}^{(j)}} \;\; B_3 \overline{\hat{r}^{(j)}} \;\; A_4 \hat{r}^{(j)} \,], $$

where the scalar $c_2 \propto 1/\sqrt{MP + M}$ normalizes the
average power at the relay to $P$; and $A_i$ and $B_i$ are the DSTC encoding matrices [17], as given in (10).

During the second step, information streams from all sources are concurrently forwarded to the destination
by sending $t_i$ from the $i$-th antenna. With this design, the relay uses four of its antennas to forward the quasi-orthogonal
DSTC of each source and the information of all sources is forwarded concurrently. Denote $y_{\tau n}$ and
$w_{\tau n}$ as the sampled signal and noise at the $n$-th destination antenna and time slot $\tau$, respectively. Following the
analysis in [21], two equivalent Alamouti systems can be obtained as

where $g_{kn}^{(j)}$ ($k = 1, 2, 3, 4$) denotes the four channel paths from the four relay antennas that forward Source
$j$'s DSTC to the $n$-th destination antenna, i.e., $g_{kn}^{(j)} = g_{(4j-4+k)n}$, and the equivalent noises at the
relay can be shown to be i.i.d. $\mathcal{CN}\big(0, (\sum_{i=1:M} |f_i^{(j)}|^2)^{-1} I\big)$. By applying the multi-user IC proposed for
quasi-orthogonal STBC in [21], the destination can cancel the symbols of Sources 2 to $J$ for each Alamouti
system. Denote $B^\star$ as the IC matrix for system $\star$ ($\star = +, -$), which can be obtained similarly following the
iterative IC process in Subsection 4.1.1. Let $G_1^\star = [G_{11}^{\star t} \cdots G_{1N}^{\star t}]^t$ and $y^\star = [y_1^{\star t} \cdots y_N^{\star t}]^t$. The resulting
system equation for Source 1's information after the IC can be expressed as

where $w^\star = [w_1^{\star t} \cdots w_N^{\star t}]^t$; $H$ and $n$ denote the $4(N - J + 1) \times 4$ equivalent channel matrix
and the $4(N - J + 1) \times 1$ equivalent noise vector, respectively. From (23), it can be shown that two procedures
of pair-wise ML decoding are sufficient to decode the four symbols of Source 1.

4.1.3 Concurrent$_{R\to D}$-IC$_D$ for general $1_J \times M_1 \times N_1$ MARNs

For general $1_J \times M_1 \times N_1$ MARNs, each source transmits a vector of $2^n$ symbols in TDMA during the first step,
where $2^n$ is the smallest power of two that is no less than $\lfloor M/J \rfloor$. The relay constructs one $2^n \times 2^n$ DSTC for each source using the
quasi-orthogonal STBCs with ABBA structure [25, 26]. During the second step, the first $\lfloor M/J \rfloor$
columns of each DSTC are forwarded using $\lfloor M/J \rfloor$ antennas of the relay. All DSTCs are concurrently forwarded
to the destination. The destination separates the equivalent system into $2^{n-1}$ Alamouti systems [16], then
decouples the information of each source by IC [21], after which it decodes each source's information independently.

4.2 Diversity Gain Analysis 

In this subsection, we analyze the achievable diversity gain of Concurrent$_{R\to D}$-IC$_D$. As discussed in Subsection
3.2.2, the diversity gain can be calculated using the outage probability of the instantaneous normalized
receive SNR as in (15). To help the presentation, we use an equivalent representation of (15). We say that an
instantaneous normalized receive SNR $\gamma$ provides a diversity gain of $d$ if $P(\gamma < \epsilon) = a_1 \epsilon^d + o(\epsilon^d)$ with $a_1$
independent of $\epsilon$. To calculate the diversity gain of Concurrent$_{R\to D}$-IC$_D$, the following lemma is used [10].

Lemma 1. Let $\gamma_1, \gamma_2, \ldots, \gamma_k, \gamma_g$ be $k + 1$ instantaneous normalized receive SNRs. $\gamma_g$ is independent of $\gamma_n$ for
$n = 1, 2, \ldots, k$. $\gamma_g$ provides a diversity gain of $d_1$; $\sum_{n=1:k} \gamma_n$ provides a diversity gain of $d_2$. If
$\gamma = \sum_{n=1:k} \frac{\gamma_n \gamma_g}{\gamma_n + \gamma_g}$,
$\gamma$ provides a diversity gain of $\min\{d_1, d_2\}$.
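As a brief sanity check on Lemma 1 (not the proof from the cited reference), assume the harmonic-mean-type form $\gamma = \gamma_1 \gamma_g / (\gamma_1 + \gamma_g)$ for $k = 1$. The elementary bounds below show that $\gamma$ behaves like $\min\{\gamma_1, \gamma_g\}$ up to a factor of two, which is why the smaller of the two diversity gains dominates.

```latex
% For any a, b > 0 (here a = \gamma_1, b = \gamma_g):
\tfrac{1}{2}\min\{a, b\}
  = \frac{ab}{2\max\{a, b\}}
  \le \frac{ab}{a + b}
  \le \frac{ab}{\max\{a, b\}}
  = \min\{a, b\}.
% Consequently, for every \epsilon > 0,
P\big(\min\{\gamma_1, \gamma_g\} < \epsilon\big)
  \le P(\gamma < \epsilon)
  \le P\big(\min\{\gamma_1, \gamma_g\} < 2\epsilon\big).
```

Since $\gamma_1$ and $\gamma_g$ are independent, $P(\min\{\gamma_1, \gamma_g\} < \epsilon)$ is dominated as $\epsilon \to 0$ by the term with the smaller exponent, so both bounds decay as $\epsilon^{\min\{d_1, d_2\}}$.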

Here is the theorem on the diversity gain of Concurrent$_{R\to D}$-IC$_D$.

Theorem 2. In $1_J \times M_1 \times N_1$ MARNs, Concurrent$_{R\to D}$-IC$_D$ achieves a diversity gain of $\min\{M, \lfloor M/J \rfloor (N - J + 1)\}$.

Proof. See Appendix B. □

Intuitively, the result in Theorem 2 can be explained as follows. From the protocol design, since sources
are assigned to orthogonal channels in the first step of transmission, the maximum diversity gain that can be
achieved in the source-relay link is $M$. For the second step, each source is allocated $\lfloor M/J \rfloor$ antennas of the relay.
Then, the transmit diversity gain is $\lfloor M/J \rfloor$. Similar to MAC systems, the destination uses $J - 1$ antennas to cancel
the symbols of interfering sources and obtains the full receive diversity gain $N - J + 1$ at the remaining antennas.
Thus, the maximum achievable diversity gain in the relay-destination link is $\lfloor M/J \rfloor (N - J + 1)$. The overall
diversity gain of Concurrent$_{R\to D}$-IC$_D$ is thus upperbounded by the minimum of the two values, and Theorem
2 shows that Concurrent$_{R\to D}$-IC$_D$ achieves this upperbound. When $M = 2J$ and $M = 4J$, the following
corollary is obtained as special cases of Theorem 2.

Corollary 2. In the $1_J \times (2J)_1 \times N_1$ MARN, Concurrent$_{R\to D}$-IC$_D$ achieves a diversity gain of $2 \min\{J, N - J + 1\}$;
in the $1_J \times (4J)_1 \times N_1$ MARN, Concurrent$_{R\to D}$-IC$_D$ achieves a diversity gain of $4 \min\{J, N - J + 1\}$.
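The formula of Theorem 2 is easy to evaluate for concrete networks. The sketch below uses a hypothetical function name and simply computes $\min\{M, \lfloor M/J \rfloor (N - J + 1)\}$; the test networks are the ones simulated in Section 5.

```python
def diversity_gain_concurrent_rd(J: int, M: int, N: int) -> int:
    """Diversity gain min{M, floor(M/J) * (N - J + 1)} stated in Theorem 2."""
    if not 1 <= J <= min(M, N):
        raise ValueError("requires 1 <= J <= min(M, N)")
    return min(M, (M // J) * (N - J + 1))


# (J, M, N) triples for the 1_J x M_1 x N_1 MARNs simulated in Section 5.
networks = [(2, 2, 2), (3, 3, 3), (2, 4, 2), (2, 8, 2),
            (2, 2, 3), (2, 2, 4), (3, 3, 5), (2, 4, 3)]
for J, M, N in networks:
    print((J, M, N), diversity_gain_concurrent_rd(J, M, N))
```

For $M = 2J$ and $M = 4J$ the function reduces to the two expressions of Corollary 2, since $\lfloor M/J \rfloor$ equals 2 and 4, respectively.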

4.3 Discussion 

In this subsection, we discuss several properties of Concurrent$_{R\to D}$-IC$_D$, including the constraint on the number
of sources, the condition to achieve the int-free diversity gain, and the symbol rate. The comparison of
Concurrent$_{R\to D}$-IC$_D$ with other linear schemes is also presented. Finally, we provide an application in the
interference relay network.

First, we discuss the constraint on the number of sources for Concurrent$_{R\to D}$-IC$_D$. Since in this protocol
each source is allocated a different set of relay antennas for the concurrent transmission in the relay-destination
link, the number of relay antennas needs to be no less than the number of sources, i.e., $M \ge J$. At the
destination, to fully decouple signals of different sources, at least $J - 1$ antennas are required to cancel the
symbols of $J - 1$ sources. In other words, $N \ge J$. Therefore, $J \le \min\{M, N\}$ is required. This condition is
the same as that for Concurrent$_{S\to R\to D}$-IC$_D$ as discussed in Subsection 3.3. To guarantee this condition, user
admission control in the upper layer is needed.

Next, we show that Concurrent$_{R\to D}$-IC$_D$ has the potential to achieve the int-free diversity gain. Recall that
the int-free diversity gain is defined as the maximum achievable diversity gain when there is no interference
in both links. For $1_J \times M_1 \times N_1$ MARNs, the int-free diversity gain is $\min\{M, MN\} = M$, achievable by
TDMA$_{S\to R\to D}$ [18]. Theorem 2 indicates that when

$$ N \ge \frac{M}{\lfloor M/J \rfloor} + J - 1, \qquad (24) $$

Concurrent$_{R\to D}$-IC$_D$ achieves the int-free diversity gain of $M$. Eq. (24) is called the int-free condition. For
networks satisfying the int-free condition, Concurrent$_{R\to D}$-IC$_D$ allows multi-source transmission in the relay-destination
link without sacrificing the diversity gain. If $M$ is a multiple of $J$, this condition can be further
simplified as $N \ge 2J - 1$. Examples of networks satisfying the int-free condition are the $1_2 \times 2_1 \times 3_1$, $1_2 \times 4_1 \times 3_1$,
$1_3 \times 3_1 \times 5_1$, and $1_3 \times 6_1 \times 5_1$ MARNs.
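The int-free condition can be checked mechanically. In this sketch (hypothetical function name), the inequality $N \ge M / \lfloor M/J \rfloor + J - 1$ is rewritten in the equivalent integer form $\lfloor M/J \rfloor (N - J + 1) \ge M$ to avoid fractions; this is the same statement as requiring the relay-destination term in Theorem 2 to be at least $M$.

```python
def satisfies_int_free_condition(J: int, M: int, N: int) -> bool:
    """True when floor(M/J) * (N - J + 1) >= M, i.e. N >= M/floor(M/J) + J - 1."""
    if not 1 <= J <= min(M, N):
        raise ValueError("requires 1 <= J <= min(M, N)")
    return (M // J) * (N - J + 1) >= M


# Networks listed in the text as satisfying the condition, as (J, M, N).
listed = [(2, 2, 3), (2, 4, 3), (3, 3, 5), (3, 6, 5)]
print([satisfies_int_free_condition(*t) for t in listed])

# Networks from Section 5 whose reported diversity gain is below M.
not_int_free = [(2, 2, 2), (3, 3, 3), (2, 4, 2), (2, 8, 2)]
print([satisfies_int_free_condition(*t) for t in not_int_free])
```

When $M$ is a multiple of $J$, the check reduces to $N \ge 2J - 1$, as noted above.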

In what follows, we discuss the symbol rate of Concurrent$_{R\to D}$-IC$_D$. In $1_J \times M_1 \times N_1$ MARNs, for each
source to transmit a vector of $2^n$ symbols ($2^n$ is the smallest power of two no less than $\lfloor M/J \rfloor$), $2^n J$ channel uses
are needed in the first link and $2^n$ channel uses are needed in the second link. Thus, $2^n$ symbols from each
source are transmitted using $2^n (J + 1)$ channel uses from end to end. The symbol rate can thus be calculated
as $R = \frac{2^n}{2^n (J + 1)} = \frac{1}{J + 1}$ symbols/source/channel use.
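The channel-use accounting for this protocol can be reproduced with a few lines. The function name is hypothetical, and the block length is taken as the smallest power of two no less than $\lfloor M/J \rfloor$, as described in the text.

```python
from fractions import Fraction


def symbol_rate_concurrent_rd(J: int, M: int) -> Fraction:
    """Symbols per source per channel use: TDMA first hop, concurrent second hop."""
    if not 1 <= J <= M:
        raise ValueError("requires 1 <= J <= M")
    antennas_per_source = M // J
    block = 1
    while block < antennas_per_source:  # smallest power of two >= floor(M/J)
        block *= 2
    first_hop_uses = block * J          # sources take turns on the source-relay link
    second_hop_uses = block             # all DSTCs are forwarded at the same time
    return Fraction(block, first_hop_uses + second_hop_uses)


for J, M in [(2, 2), (2, 4), (3, 3), (2, 6)]:
    print((J, M), symbol_rate_concurrent_rd(J, M))
```

The block length cancels, so the rate depends only on the number of sources.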

Concurrent$_{R\to D}$-IC$_D$ fits the linear framework in which the relay linearly transforms its received signals
to generate output signals without decoding and the destination decouples signals from different sources to
separately decode each source's information. In what follows, we compare the diversity gain, symbol rate,
and CSI requirements at the relay of Concurrent$_{R\to D}$-IC$_D$ with the two existing linear schemes, TDMA$_{S\to R\to D}$
and Concurrent$_{S\to R}$-IC$_R$, as well as Concurrent$_{S\to R\to D}$-IC$_D$. The results are shown in Table 1. For MARNs
satisfying the int-free condition, Concurrent$_{R\to D}$-IC$_D$ and TDMA$_{S\to R\to D}$ achieve the maximum int-free diversity
gain, higher than that of Concurrent$_{S\to R}$-IC$_R$. Both Concurrent$_{R\to D}$-IC$_D$ and Concurrent$_{S\to R}$-IC$_R$ achieve
a higher symbol rate than TDMA$_{S\to R\to D}$. Thus, Concurrent$_{R\to D}$-IC$_D$ outperforms TDMA$_{S\to R\to D}$ in
terms of symbol rate and exceeds Concurrent$_{S\to R}$-IC$_R$ in terms of diversity gain. For Concurrent$_{R\to D}$-IC$_D$ and
Concurrent$_{S\to R}$-IC$_R$, the relay needs to know its channels with all sources, which can be obtained by training.
For MARNs not satisfying the int-free condition, Concurrent$_{R\to D}$-IC$_D$ may not achieve the int-free diversity.
For the two proposed protocols, each has its advantage over the other: Concurrent$_{R\to D}$-IC$_D$ achieves a higher
diversity gain, whereas Concurrent$_{S\to R\to D}$-IC$_D$ has a higher symbol rate.

Concurrent$_{R\to D}$-IC$_D$ can also be applied to the interference relay network with one multi-antenna relay
and several multi-antenna destinations. The relay processes its received signals in the same way as that in 
the MARN. Each destination cancels the information of undesired sources and decodes the information of its 
interest as long as the numbers of antennas at each destination and the relay are no less than the number of 
sources. 

5 Numerical Results 

In this section, we present simulated BERs of Concurrent$_{S\to R\to D}$-IC$_D$ and Concurrent$_{R\to D}$-IC$_D$ and compare them
with the BERs of other existing schemes with similar complexities and CSI requirements. Since the average
power constraints at all nodes are equal to P and noises are normalized, the average transmit SNR at each node 
is P. For all figures, the horizontal axis represents the average transmit SNR, measured in dB; the vertical axis 
represents the BER. 

In Fig. 4, the BER of the first proposed scheme, Concurrent$_{S\to R\to D}$-IC$_D$, is demonstrated for seven MARNs:
$1_2 \times 2_1 \times 2_1$, $1_2 \times 2_1 \times 3_1$, $1_2 \times 2_1 \times 4_1$, $1_2 \times 4_1 \times 2_1$, $1_2 \times 4_1 \times 3_1$, $1_2 \times 4_1 \times 4_1$, and $1_3 \times 4_1 \times 3_1$. BPSK
modulation is used. Fig. 4 shows that the scheme achieves a diversity gain of 1 in the $1_2 \times 2_1 \times 2_1$, $1_2 \times 2_1 \times 3_1$,
and $1_2 \times 2_1 \times 4_1$ MARNs. Additional array gain can be achieved when the number of destination antennas
is increased. In the $1_3 \times 4_1 \times 3_1$ MARN, the diversity gain is 2; while in the $1_2 \times 4_1 \times 2_1$, $1_2 \times 4_1 \times 3_1$,
and $1_2 \times 4_1 \times 4_1$ MARNs, the diversity gain is slightly less than 3. This is because of the $\log P$ factor in the
error rate formula [17]. As $P$ increases, the diversity gain approaches 3. These results justify the validity of the
diversity upperbound presented in Theorem 1 and show the achievability of the upperbound for these network
scenarios. Comparing the results for the $1_3 \times 4_1 \times 3_1$ and $1_2 \times 2_1 \times 3_1$ MARNs, we can see that the number
of sources that a MARN can accommodate and the diversity gain of a MARN can be improved simultaneously
by increasing the number of relay antennas. From the results for the $1_2 \times 4_1 \times 3_1$ and $1_3 \times 4_1 \times 3_1$ MARNs,
we conclude that with a fixed number of relay antennas, the diversity gain decreases as the number of sources
in the network increases.

Fig. 5 exhibits the BER of the second proposed scheme, Concurrent$_{R\to D}$-IC$_D$, in 8 MARNs: $1_2 \times 2_1 \times 2_1$,
$1_2 \times 2_1 \times 3_1$, $1_2 \times 2_1 \times 4_1$, $1_3 \times 3_1 \times 3_1$, $1_3 \times 3_1 \times 5_1$, $1_2 \times 4_1 \times 2_1$, $1_2 \times 4_1 \times 3_1$, and $1_2 \times 8_1 \times 2_1$. In all
scenarios, BPSK modulation is used. For MARNs with parameters $1_2 \times 2_1 \times 2_1$, $1_3 \times 3_1 \times 3_1$, $1_2 \times 4_1 \times 2_1$,
and $1_2 \times 8_1 \times 2_1$, Concurrent$_{R\to D}$-IC$_D$ achieves diversity gains of 1, 1, 2, and 4, respectively. Note that
the int-free diversity gains in these networks are 2, 3, 4, and 8, respectively. Concurrent$_{R\to D}$-IC$_D$ does not
achieve the int-free diversity gain for these scenarios. For MARNs with parameters $1_2 \times 2_1 \times 3_1$, $1_2 \times 2_1 \times 4_1$,
$1_3 \times 3_1 \times 5_1$, and $1_2 \times 4_1 \times 3_1$, Concurrent$_{R\to D}$-IC$_D$ achieves diversity gains of 2, 2, 3, and 4, respectively,
which are the int-free diversity gains. The parameters of these four networks satisfy the int-free condition given
in Eq. (24). The simulation results for these eight networks justify our diversity result in Theorem 2.

In the following, we compare the proposed Concurrent$_{S\to R\to D}$-IC$_D$ (Scheme 1) and Concurrent$_{R\to D}$-IC$_D$
(Scheme 2) with other schemes: Concurrent$_{S\to R}$-IC$_R$ (Scheme 3), TDMA$_{S\to R\to D}$ (Scheme 4),
Concurrent$_{R\to D}$-D$_R$-IC$_D$ (Scheme 5), and Concurrent$_{S\to R\to D}$ (Scheme 6). Schemes 3 and 4 are introduced in Section 1. To
compare our methods with schemes having decoding at the relay, Scheme 5 is introduced. It is similar to
Scheme 2 except that the relay conducts the ML decoding based on the soft estimate in (16). After that, symbols
are re-encoded and forwarded to the destination using the same constellation. Scheme 6 is similar to Scheme 1,
but in Scheme 6 the destination jointly decodes all sources' information without IC. Note that Schemes 1, 2, 3, and 4
satisfy the constraints of the linear framework, but Schemes 5 and 6 do not fit the linear framework and have
higher complexity than the other four schemes. For fair comparison in the numerical experiments, we fix the
bit rate to be 1 bit/source/channel use regardless of the scheme and plot the BERs of the schemes as a function
of the average transmit SNR. Thus, QPSK, 8PSK, 8PSK, 16PSK, 8PSK, and QPSK are used for Schemes 1, 2,
3, 4, 5, and 6, respectively.
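The constellation choices follow from dividing the target bit rate by each scheme's symbol rate. The sketch below reproduces that arithmetic for $J = 2$. The function names are hypothetical, and the per-scheme symbol rates are assumptions read off the discussion above: 1/2 for Schemes 1 and 6, $1/(J+1)$ for Schemes 2, 3, and 5, and $1/(2J)$ for Scheme 4.

```python
from fractions import Fraction


def assumed_symbol_rates(J: int) -> dict:
    """Symbol rate (symbols/source/channel use) assumed for Schemes 1 to 6."""
    return {
        1: Fraction(1, 2),
        2: Fraction(1, J + 1),
        3: Fraction(1, J + 1),
        4: Fraction(1, 2 * J),
        5: Fraction(1, J + 1),
        6: Fraction(1, 2),
    }


def constellation_size(bit_rate: Fraction, symbol_rate: Fraction) -> int:
    """Number of constellation points needed to hit bit_rate at symbol_rate."""
    bits_per_symbol = bit_rate / symbol_rate
    if bits_per_symbol.denominator != 1:
        raise ValueError("bit rate is not an integer multiple of the symbol rate")
    return 2 ** bits_per_symbol.numerator


rates = assumed_symbol_rates(J=2)
sizes = [constellation_size(Fraction(1), rates[k]) for k in range(1, 7)]
print(sizes)
```

For $J = 2$ the resulting sizes are 4, 8, 8, 16, 8, and 4 points, which correspond to QPSK, 8PSK, 8PSK, 16PSK, 8PSK, and QPSK.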

Figs. 6, 7, and 8 show BERs of these schemes in the $1_2 \times 2_1 \times 2_1$, $1_2 \times 2_1 \times 3_1$, and $1_2 \times 4_1 \times 3_1$ MARNs,
respectively. We compare the BERs of the four linear schemes. We first look at the $1_2 \times 2_1 \times 2_1$ MARN
whose BERs are shown in Fig. 6. Only Scheme 4 achieves the maximum int-free diversity, thus it has the best
performance at high SNR (26 dB and up). The other three schemes have a diversity gain of 1. Scheme 3 has
the lowest BER for SNR less than 26 dB, because of its high signal to interference-plus-noise ratio (SINR) at
the destination. Thus, for the $1_2 \times 2_1 \times 2_1$ MARN, the proposed schemes, Schemes 1 and 2, are inferior in
BER. The next is the $1_2 \times 2_1 \times 3_1$ MARN. We can see from Fig. 7 that only Schemes 2 and 4 achieve the
maximum int-free diversity, while Scheme 2 has lower BER than the other schemes for all the simulated SNR
values. Its advantage over Scheme 4 is about 5 dB. This is because Scheme 2 has a higher symbol rate. For the
same bit rate, it can use a smaller constellation, which provides higher array gain. Thus, for the $1_2 \times 2_1 \times 3_1$ MARN,
Scheme 2 is the best. Finally, in the $1_2 \times 4_1 \times 3_1$ MARN shown in Fig. 8, Scheme 2 has the highest diversity
gain. When SNR is higher than 23 dB, Scheme 2 has the lowest BER. Scheme 1 outperforms the other three in
the SNR regime from 17 to 23 dB, because it has the highest symbol rate and uses the smallest constellation to
achieve the same bit rate. When the SNR is smaller than 17 dB, Scheme 3 has the lowest BER. Therefore, for
the $1_2 \times 4_1 \times 3_1$ MARN, our proposed two schemes have lower BER compared to the existing schemes when
SNR is higher than 17 dB. We can conclude from the three experiments that the relative quality of the four
schemes depends on the network parameters and SNR range. The proposed Scheme 1, Concurrent$_{S\to R\to D}$-IC$_D$,
is expected to have good reliability in the low to moderate SNR range, as observed in Fig. 8. The proposed
Scheme 2, Concurrent$_{R\to D}$-IC$_D$, is expected to have good reliability for MARNs whose relay-destination link
is much stronger than the source-relay link (e.g., the $1_2 \times 2_1 \times 3_1$ MARN). These are due to the nature of the
design explained in Section 3 and Section 4.

In what follows, we compare the proposed schemes with Schemes 5 and 6, which do not satisfy the linear
constraints. We first compare Scheme 1 with Scheme 6. Note that the ML decoding of Scheme 1 is symbol-wise
in the $1_2 \times 2_1 \times N_1$ MARN and pair-wise in the $1_2 \times 4_1 \times N_1$ MARN, while for Scheme 6, the destination
needs to jointly decode four symbols in the $1_2 \times 2_1 \times N_1$ MARN and eight symbols in the $1_2 \times 4_1 \times N_1$
MARN. For networks with large $J$ and $M$, the decoding complexity of Scheme 6 is exponential in $JM$ and thus
becomes impractical. For Scheme 1, the decoding complexity is linear in $J$ and exponential in $M/2$, and is thus
much lower. Figs. 6, 7, and 8 show that this extra decoding complexity can improve both diversity gain and
array gain. The diversity gain improvements are 1 in all three networks. For the $1_2 \times 4_1 \times 3_1$ MARN (Fig. 8),
the array gain improvement is the smallest (about 3 dB at BER $= 10^{-2}$) compared to the other two networks.
Therefore, compared to Scheme 6, Scheme 1 is preferable in large networks, where it trades performance degradation
for lower complexity. Then, we compare Scheme 2 with Scheme 5 to see whether the extra decoding at the
relay can provide better performance. For the three networks shown in Figs. 6, 7, and 8, Scheme 2 has the
same diversity gain as Scheme 5. For the $1_2 \times 2_1 \times 2_1$ MARN (Fig. 6), Scheme 2 has approximately the same
performance as Scheme 5 for all the simulated SNR values. For the $1_2 \times 2_1 \times 3_1$ MARN (Fig. 7), Scheme 2 has
the same performance as Scheme 5 in the high SNR regime while it is about 1 dB worse at low to moderate
SNRs. For the $1_2 \times 4_1 \times 3_1$ MARN (Fig. 8), Scheme 2 is approximately 2 dB worse for all SNRs. These
observations can be explained as follows. For the $1_2 \times 2_1 \times 2_1$ MARN, with Scheme 2, the BER of the network
is mainly constrained by the second hop (with IC at the destination, the second hop has only a diversity gain of
1, but the first hop has a diversity gain of 2). The extra decoding at the relay only improves the performance in
the first hop and does not help the overall performance much. For the $1_2 \times 2_1 \times 3_1$ and $1_2 \times 4_1 \times 3_1$ MARNs, with
Scheme 2, the two links have similar qualities (both links have diversity gain 2 for the $1_2 \times 2_1 \times 3_1$ MARN and
4 for the $1_2 \times 4_1 \times 3_1$ MARN). The extra decoding at the relay can therefore provide better performance.

6 Conclusions 

This paper studies multi-source transmission schemes for $1_J \times M_1 \times N_1$ MARNs. For complexity considerations,
a linear framework is introduced, where the relay conducts linear transformation without decoding and
the destination decouples signals from different sources so that the decoding complexity is linear in the number
of sources. We propose two protocols that use multiple antennas at the destination to resolve multi-source
interference. The protocol of Concurrent$_{S\to R\to D}$-IC$_D$ allows concurrent transmission of information streams from
multiple sources in both the source-relay link and the relay-destination link. The relay performs DSTC and does
not require any CSI. The destination uses the multi-antenna IC technique to decouple signals from different
sources. Concurrent$_{S\to R\to D}$-IC$_D$ achieves a symbol rate of 1/2 symbols/source/channel use, but its diversity
gain is shown to be upperbounded by $M - J + 1$. Thus, for this protocol, diversity gain degradation is
the price of a higher symbol rate. To improve the diversity gain, we propose Concurrent$_{R\to D}$-IC$_D$, in which
concurrent transmission is allowed in the relay-destination link but TDMA is used in the source-relay link.
After receiving signals from the sources, the relay first conducts MRC to maximize the SNR of each source,
then concurrently transmits all sources' information to the destination using DSTC. At the destination, IC is
performed to decouple signals from different sources before decoding. Through analysis and simulations, it is
shown that Concurrent$_{R\to D}$-IC$_D$ achieves a diversity gain of $\min\{M, \lfloor M/J \rfloor (N - J + 1)\}$ with a symbol rate
of $\frac{1}{J+1}$. When $N \ge 2J - 1$, Concurrent$_{R\to D}$-IC$_D$ achieves the same maximum int-free diversity gain of the
network but with a higher symbol rate, compared to a full TDMA scheme.

Appendix 

A Proof of Theorem 1

To prove this theorem, it suffices to find an upperbound on the instantaneous normalized receive SNR. We first
show the scenario of $J = 2$ and $M = 2$, then its generalization.

From its expression in Section 3, the noise covariance matrix $R_n$ can be lowerbounded by $R_n \succeq B B^*$. Then, the instantaneous
normalized receive SNR for $s_1$ can be upperbounded as

where the first equality holds because $\Theta^* \Theta = \Theta$ from the definition of projection; $f^{(j)}$ is a $2 \times 1$ channel
vector from Source $j$ to the relay, and $\Theta$ is a projection matrix on
to the null space of $f^{(2)}$. The upperbound is a product of two parts: a part $g$ that depends on the relay-destination
channels only, and a part $f = f^{(1)*} \Theta f^{(1)}$. Clearly, the random variable $g$ is Gamma distributed with degree $2N$. Next, we
show that given $f^{(2)}$, $f$ is Gamma distributed with degree 1. Note that the two eigenvalues of $\Theta$ are 1 and
0. Then, $f^{(1)*} \Theta f^{(1)} = f^{(1)*} u_1 u_1^* f^{(1)}$, where $u_1$ is the eigenvector corresponding to the eigenvalue 1. Since
$\Theta$ depends on $f^{(2)}$ only, $u_1$ is independent of $f^{(1)}$. Thus, given $f^{(2)}$, $u_1^* f^{(1)}$ is $\mathcal{CN}(0, 1)$ distributed and $f$ is
Gamma distributed with degree 1. The outage probability of $\gamma$ can be bounded as

where $a$ is a constant independent of $\epsilon$. By (15), the diversity gain is upperbounded by one.

For MARNs with general $J$ and $M$, the IC operation at the destination similarly creates a virtual ZF operation
at the relay as discussed in Subsection 3.2.1. The virtual ZF matrix $\Phi$ nulls out $f^{(j)}$, $j = 2, \ldots, J$.
Following a similar process, it can be shown that the instantaneous normalized receive SNR is upperbounded
by a product of two parts as in (26): the first part depends on $g_{in}$ only, the second part is equal to $f^{(1)*} \Theta f^{(1)}$,
where $f^{(j)}$ is the $M \times 1$ channel vector from Source $j$ to the relay and $\Theta$ is a projection matrix onto the null
space of $f^{(2)}$ to $f^{(J)}$. Similarly, the diversity gain can be shown to be no higher than $M - J + 1$.



B Proof of Theorem 2

We first prove the case $M = 2J$, then the case $M = 4J$, followed by the general scenario. When $M = 2J$, the
channel vector experienced by $s_1$ is $B g_1$ from (20), where $g_1$ denotes the first column of $G_1$. Note that the
noise covariance matrix is given in (21). By the definition of the instantaneous normalized receive SNR, $\gamma$ is
obtained through the chain of equalities (27) to (30). From (27) to (28), the matrix inversion lemma is applied.
For (29), we use the fact that $G_1^* B^* (BB^*)^{-1} B G_1$ is a Hermitian matrix with Alamouti structure. Thus,
$G_1^* B^* (BB^*)^{-1} B G_1$ is a $2 \times 2$ diagonal matrix whose diagonal entries are equal to $g_1^* B^* (BB^*)^{-1} B g_1$.
Eq. (30) follows from (29) because the second entry of the vector $g_1^* B^* (BB^*)^{-1} B G_1$ is zero. Let
$y = g_1^* B^* (BB^*)^{-1} B g_1$ and $x = \sum_{i=1}^{M} |f_i|^2$. Then $\gamma$ can be written as a scaled harmonic mean of the
variables $x$ and $c_1 y$. Since $x$ is the sum of $M$ independent random variables with exponential distribution, $x$ is
Gamma distributed with degree $M$. Thus, $x$ has a diversity gain of $M$. For $c_1 y$, when $P \gg 1$, $c_1 \approx$ [...].
From the iterative algorithm and (19), $B$ depends on $G_{jn}$ for $j = 2, \ldots, J$. On the other hand, $g_1$ only
depends on $G_{1n}$. Thus, $B^* (BB^*)^{-1} B$ and $g_1$ are independent. Because $B$ zero-forces $G_2$ to $G_J$, it can be
shown that $B^* (BB^*)^{-1} B$ is a projection matrix onto the null space of the subspace spanned by the columns of
$G_j$ for $j = 2, \ldots, J$. Then, $y$ is Gamma distributed with degree $2(N - J + 1)$, implying that $c_1 y$ has diversity
gain $2(N - J + 1)$. Let $k = 1$ in Lemma 1. The diversity gain of $\gamma$ is the smaller of the diversities of $x$ and $y$.
Then, the achievable diversity gain of Concurrent$_{R \to D}$-IC$_D$ is
$\min\{M, 2(N - J + 1)\} = 2 \min\{J, N - J + 1\}$.
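The "smaller of the two diversities" step can be illustrated with a small Monte Carlo experiment. This is only a sketch under stated assumptions, not the analysis of the paper: it takes $\gamma$ to be the harmonic-mean-type quantity $x c_1 y / (x + c_1 y)$, treats $c_1$ as a fixed constant, draws $x$ and $y$ as independent unit-scale Gamma variables of degrees $M$ and $2(N - J + 1)$, and uses the exponent $d$ in $P(\gamma < \epsilon) \sim \epsilon^d$ for small $\epsilon$ as a proxy for the diversity gain. The function name `diversity_slope`, the sample size, and the grid of $\epsilon$ values are hypothetical choices.

```python
import numpy as np

def diversity_slope(M, N, J, c1=1.0, n_samples=400_000, seed=0):
    """Estimate d in P(gamma < eps) ~ eps**d for gamma = x*c1*y / (x + c1*y)."""
    rng = np.random.default_rng(seed)
    x = rng.gamma(shape=M, scale=1.0, size=n_samples)                # degree M
    y = rng.gamma(shape=2 * (N - J + 1), scale=1.0, size=n_samples)  # degree 2(N-J+1)
    gamma = x * c1 * y / (x + c1 * y)
    eps = np.array([0.1, 0.2, 0.4])                                  # small thresholds
    prob = np.array([(gamma < e).mean() for e in eps])               # empirical P(gamma < eps)
    slope, _ = np.polyfit(np.log(eps), np.log(prob), 1)              # log-log slope
    return float(slope)

# Hypothetical network with J = 2, M = 2J = 4, N = 2, so min{M, 2(N - J + 1)} = 2.
print(diversity_slope(M=4, N=2, J=2))
```

With these sizes the estimated slope should come out close to 2, the smaller of the two Gamma degrees. The thresholds are not vanishingly small, so the estimate is only approximate.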

For $M = 4J$, the instantaneous normalized receive SNR can be shown to be $\gamma = \mathrm{tr}(H^* R_n^{-1} H)$ [16],
where $H$ denotes the equivalent channel matrix and $R_n$ denotes the covariance matrix of $n$ in (23). After
straightforward calculation, and by calculation procedures similar to those from (27) to (30), it follows that

$$\gamma = \frac{xy}{x + 2c_1 y} + \frac{xz}{x + 2c_1 z}, \qquad (32)$$

where $x = \sum_{i=1}^{M} |f_i|^2$, $y = g_1^{+*} B^{+*} (B^{+} B^{+*})^{-1} B^{+} g_1^{+}$, and
$z = g_1^{-*} B^{-*} (B^{-} B^{-*})^{-1} B^{-} g_1^{-}$, with $g_1^{\star}$ the first column of $G_1^{\star}$ for $\star = +, -$.
The random variable $x$ is Gamma distributed with degree $M$ and has a diversity gain of $M$. The random variables
$y$ and $z$ are independent and both have a diversity gain of $2(N - J + 1)$. Then, $y + z$ has a diversity gain of
$4(N - J + 1)$. Let $k = 2$ in Lemma 1. The achievable diversity gain of Concurrent$_{R \to D}$-IC$_D$ is
$\min\{M, 4(N - J + 1)\} = 4 \min\{J, N - J + 1\}$.
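One way to see why the sum of two harmonic-mean-type terms inherits the smaller of the diversities of $x$ and $y + z$ is an elementary sandwich bound. The sketch below is illustrative and is not the statement of Lemma 1. It assumes that $\gamma$ has the form $\frac{xy}{x + 2c_1 y} + \frac{xz}{x + 2c_1 z}$ and treats $c_1 > 0$ as a constant.

```latex
\begin{gather*}
\tfrac{1}{2}\min\{a,b\} \;\le\; \frac{ab}{a+b} \;\le\; \min\{a,b\}
   \qquad (a, b > 0), \\[4pt]
\frac{xy}{x + 2c_1 y} = \frac{1}{2c_1}\cdot\frac{x\,(2c_1 y)}{x + 2c_1 y}
   \;\Longrightarrow\;
   \frac{\min\{x,\, 2c_1 y\}}{4c_1} \;\le\; \frac{xy}{x + 2c_1 y} \;\le\; \frac{\min\{x,\, 2c_1 y\}}{2c_1}, \\[4pt]
\min\{x,\, a+b\} \;\le\; \min\{x,a\} + \min\{x,b\} \;\le\; 2\min\{x,\, a+b\}, \\[4pt]
\frac{\min\{x,\, 2c_1 (y+z)\}}{4c_1} \;\le\; \gamma \;\le\; \frac{\min\{x,\, 2c_1 (y+z)\}}{c_1}.
\end{gather*}
```

Under these assumptions $\gamma$ is small only when $x$ or $y + z$ is small, which is consistent with the diversity gain $\min\{M, 4(N - J + 1)\}$ stated above.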

For MARNs with a general $M$, the relay encodes the information of one source using one quasi-orthogonal
DSTC with ABBA structure [25], [26] and forwards each codeword by $\lfloor M/J \rfloor$ of its antennas. The destination
conducts the multi-user IC technique [21]. The proof for this general case is a straightforward extension of the
proofs for the cases $M = 2J$ and $M = 4J$. Thus, the diversity result in Theorem 2 follows.

References 

[1] J. Laneman and G. Wornell, "Distributed space-time-coded protocols for exploiting cooperative diversity in wireless networks,"
IEEE Trans. on Info. Theory, vol. 49, pp. 2415-2425, Oct. 2003.

[2] Y. Jing and B. Hassibi, "Distributed space-time coding in wireless relay networks," IEEE Trans. on Wireless Comm., vol. 5, pp.
3524-3536, Dec. 2006.

[3] K. Azarian, H. Gamal, and P. Schniter, "On the achievable diversity-multiplexing tradeoff in half-duplex cooperative channels," 
IEEE Trans. on Info. Theory, vol. 51, pp. 4152-4172, Dec. 2005.

[4] J. Laneman, D. Tse, and G. Wornell, "Cooperative diversity in wireless networks: Efficient protocols and outage behavior," IEEE 
Trans. on Info. Theory, vol. 50, pp. 3062-3080, Dec. 2004.

[5] L. Venturino, X. Wang, and M. Lops, "Multiuser detection for cooperative networks and performance analysis," IEEE Trans. on
Signal Processing, vol. 54, no. 9, pp. 3315-3329, Sep. 2006.

[6] O. Oteri and A. Paulraj, "Multicell optimization for diversity and interference mitigation," IEEE Trans. on Signal Processing,
vol. 56, no. 5, pp. 2050-2061, May 2008.

[7] K. Zarifi, S. Affes, and A. Ghrayeb, "Large-system-based performance analysis and design of multiuser cooperative networks," 
IEEE Trans. on Signal Processing, vol. 57, no. 4, pp. 1511-1525, Apr. 2009.

[8] A. O. Yilmaz, "Cooperative multiple-access in fading relay channels," in Proc. of IEEE ICC, Istanbul, Turkey, Jun. 2006. 

[9] V. Morgenshtern, H. Bolcskei, and R. Nabar, "Distributed orthogonalization in large interference relay networks," in Proc. of
International Symposium on Information Theory, Adelaide, Australia, Sep. 2005, pp. 1211-1215.

[10] A. Wittneben and B. Rankov, "Distributed antenna systems and linear relaying for gigabit MIMO wireless," in Proc. of IEEE 
Vehicular Technology Conference VTC, Los Angeles, USA, Fall 2004. 

[11] A. Wittneben, "Coherent multiuser relaying with partial relay cooperation," in Proc. of IEEE WCNC, Las Vegas, NV, USA, Apr. 
2006. 

[12] B. Niu, O. Simeone, O. Somekh, and A. M. Haimovich, "Throughput of two-hop wireless networks with relay cooperation," in 
Proc. of Allerton Conference, Monticello, IL, Sep. 2007.

[13] S. Berger and A. Wittneben, "Cooperative distributed multiuser MMSE relaying in wireless Ad-Hoc networks," in Asilomar 
Conference on Signals, Systems, and Computers 2005, Pacific Grove, CA, Nov. 2005. 

[14] A. El-Keyi and B. Champagne, "Cooperative MIMO-beamforming for multiuser relay networks," in Proc. of IEEE ICASSP, Las
Vegas, NV, Apr. 2008.

[15] O. Oyman and A. Paulraj, "Power-bandwidth tradeoff in dense multi-antenna relay networks," IEEE Trans. on Wireless Comm.,
vol. 7, pp. 2282-2292, Jun. 2007. 

[16] L. Li, Y. Jing, and H. Jafarkhani, "Interference cancellation at the relay in multi-access wireless relay networks," submitted to 
IEEE Trans. on Wireless Comm., also available at http://arxiv.org/abs/1004.3807, Apr. 2010.

[17] Y. Jing and H. Jafarkhani, "Using orthogonal and quasi-orthogonal designs in wireless relay networks," IEEE Trans. on Info.
Theory, pp. 4106-4118, Nov. 2007.

[18] Y. Jing and B. Hassibi, "Diversity analysis of distributed space-time codes in relay networks with multiple transmit/receive 
antennas," EURASIP Jour. on Advances in Signal Proc., vol. 2008, 2008, article ID 254573, 17 pages, doi: 10.1155/2008/254573.

[19] A. Naguib, N. Seshadri, and A. Calderbank, "Applications of space-time block codes and interference suppression for high 
capacity and high data rate wireless systems," in Proc. of Asilomar Conf., Pacific Grove, CA, Oct. 1998.

[20] A. Stamoulis, N. Al-Dhahir, and A. Calderbank, "Further results on interference cancellation and space-time block codes," in 
Proc. of Asilomar Conf., Pacific Grove, CA, Oct. 2001.

[21] J. Kazemitabar and H. Jafarkhani, "Multiuser interference cancellation and detection for users with more than two transmit 
antennas," IEEE Trans. on Comm., pp. 574-583, Apr. 2008.

[22] S. Sun and Y. Jing, "Channel training and estimation in distributed space-time coded relay networks with multiple transmit/receive
antennas," in Proc. of IEEE WCNC, Sydney, Australia, Apr. 2010. 

[23] J. Kazemitabar and H. Jafarkhani, "Performance analysis of multiple-antenna multi-user detection," Information Theory and
Applications Workshop, Jan. 2009. 

[24] H. Jafarkhani, Space-Time Coding: Theory and Practice. Cambridge University Press, 2005. 

[25] H. Jafarkhani, "A quasi-orthogonal space-time block code," IEEE Transactions on Communications, vol. 49, pp. 1-4, Jan. 2001.

[26] O. Tirkkonen, A. Boariu, and A. Hottinen, "Minimal nonorthogonality rate 1 space-time block code for 3+ Tx antennas," in Proc. 
IEEE 6th Int. Symp. Spread-Spectrum Techniques and Applications (ISSSTA 2000), Parsippany, NJ, USA, Sep. 2000. 

[27] L. Li, Y. Jing, and H. Jafarkhani, "Using instantaneous normalized receive SNR for diversity gain calculation," CPCC Technical 
Report, available at http://escholarship.org/uc/item/951 lq6pf, Sep. 2010. 

[28] N. Sharma and C. Papadias, "Improved quasi-orthogonal codes through constellation rotation," IEEE Transactions on
Communications, vol. 51, no. 3, pp. 332-335, Mar. 2003.

[29] W. Su and X.-G. Xia, "Signal constellations for quasi-orthogonal space-time block codes with full diversity," IEEE Transactions 
on Information Theory, vol. 50, no. 10, pp. 2331-2347, Oct. 2004.




Table 1 The diversity gain and symbol rate performance for linear schemes. (The schemes marked with
* are proposed in this paper.)
Figure 1 System block diagram of Concurrent$_{S \to R \to D}$-IC$_D$.
Figure 2 Equivalent system of Concurrent$_{S \to R \to D}$-IC$_D$ with zero-forcing at the relay.
Figure 4 BER performance of Concurrent$_{S \to R \to D}$-IC$_D$, using BPSK modulation.
Figure 5 BER performance of Concurrent$_{R \to D}$-IC$_D$, using BPSK modulation.
Figure 6 Performance comparison in a I2 X 2i x 2i MARN, 1 bit/source/channel use for all schemes.
Figure 7 Performance comparison in a I2 x 2i x 3i MARN, 1 bit/source/channel use for all schemes.
Figure 8 Performance comparison in a 12 x 4i x 3i MARN, 1 bit/source/channel use for all schemes.